Message ID | 30fa381cac3dcc03b6fae4b9db5bf6c9a01f7bf6.1645549521.git.ojaswin@linux.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | tests/ext4: Ensure resizes with sparse_super2 are handled correctly | expand |
On Tue, Feb 22, 2022 at 11:20:53PM +0530, Ojaswin Mujoo wrote: > Kernel currently doesn't support resize of EXT4 mounted > with sparse_super2 option enabled. Earlier, it used to leave the resize > incomplete and the fs would be left in an inconsistent state, however commit > b1489186cc83[1] fixed this to avoid the fs corruption by clearly returning > -EOPNOTSUPP. > > Test to ensure that kernel handles resizing with sparse_super2 correctly. Run > resize for multiple iterations because this sometimes leads to kernel crash due > to fs corruption, which we want to detect. > > Related commit in mainline: > > [1] commit b1489186cc8391e0c1e342f9fbc3eedf6b944c61 > > ext4: add check to prevent attempting to resize an fs with sparse_super2 > > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> > --- > > I would like to add a few comments on the approach followed in this > test: > > 1. So although we check the return codes of the resize operation for > proper logging, the test is only considered to be passed if fsck > passes after the resize. This is because resizing a patched kernel > always keeps the fs consistent whereas resizing on unpatched kernel > always corrupts the fs. > > 2. I've noticed that running mkfs + resize multiple times on unpatched > kernel sometimes results in KASAN reporting use-after-free. Hence, if > we detect the kernel is unpatched (doesn't return EOPNOTSUPP on > resize) we continue iterating to capture this. In this case, we don't > run fsck in each iteration but run it only after all iterations are > complete because _check_scratch_fs exits the test if it fails. > > 3. In case we detect patched kernel, we stop iterating, and run fsck to > confirm success > > tests/ext4/056 | 108 +++++++++++++++++++++++++++++++++++++++++++++ > tests/ext4/056.out | 2 + > 2 files changed, 110 insertions(+) > create mode 100755 tests/ext4/056 > create mode 100644 tests/ext4/056.out > > diff --git a/tests/ext4/056 b/tests/ext4/056 > new file mode 100755 > index 00000000..0f275dea > --- /dev/null > +++ b/tests/ext4/056 > @@ -0,0 +1,108 @@ > +#! /bin/bash > +# SPDX-License-Identifier: GPL-2.0 > +# Copyright (c) 2022 IBM. All Rights Reserved. > +# > +# We don't currently support resize of EXT4 filesystems mounted > +# with sparse_super2 option enabled. Earlier, kernel used to leave the resize > +# incomplete and the fs would be left into an incomplete state, however commit > +# b1489186cc83[1] fixed this to avoid the fs corruption by clearly returning > +# -ENOTSUPP. > +# > +# This test ensures that kernel handles resizing with sparse_super2 correctly > +# > +# Related commit in mainline: > +# > +# [1] commit b1489186cc8391e0c1e342f9fbc3eedf6b944c61 > +# ext4: add check to prevent attempting to resize an fs with sparse_super2 > +# > + > +. ./common/preamble > +_begin_fstest auto ioctl resize quick > + > +# real QA test starts here > + > +INITIAL_FS_SIZE=1G > +RESIZED_FS_SIZE=$((2*1024*1024*1024)) # 2G > +ONLINE_RESIZE_BLOCK_LIMIT=$((256*1024*1024)) > + > +STOP_ITER=255 # Arbitrary return code > + > +_supported_fs ext4 > +_require_scratch_size $(($RESIZED_FS_SIZE/1024)) > +_require_test_program "ext4_resize" > + > +log() > +{ > + echo "$seq: $*" >> $seqres.full 2>&1 > +} > + > +do_resize() > +{ > + _mkfs_dev -E resize=$ONLINE_RESIZE_BLOCK_LIMIT -O sparse_super2 \ > + $SCRATCH_DEV $INITIAL_FS_SIZE >> $seqres.full 2>&1 \ > + || _fail "$MKFS_PROG failed. Exiting" > + > + _scratch_mount || _fail "Failed to mount scratch partition. Exiting" > + > + local BS=$(_get_block_size $SCRATCH_MNT) > + local REQUIRED_BLOCKS=$(($RESIZED_FS_SIZE/$BS)) > + > + local RESIZE_RET > + local EOPNOTSUPP=95 > + > + log "Resizing FS" > + $here/src/ext4_resize $SCRATCH_MNT $REQUIRED_BLOCKS >> $seqres.full 2>&1 > + RESIZE_RET=$? > + > + # Test appears to be successful. Stop iterating and confirm if FS is > + # consistent. > + if [ $RESIZE_RET = $EOPNOTSUPP ] > + then > + log "Resize operation not supported with sparse_super2" > + log "Threw expected error $RESIZE_RET (EOPNOTSUPP)" > + return $STOP_ITER > + fi > + > + # Test appears to be unsuccessful. Most probably, the fs is corrupt, > + # iterate a few more times to further stress test this. > + log "Something is wrong. Output of resize = $RESIZE_RET. \ > + Expected $EOPNOTSUPP (EOPNOTSUPP)" > + > + # unmount after ioctl sometimes fails with "device busy" so add a small > + # delay > + sleep 0.2 > + _scratch_unmount >> $seqres.full 2>&1 \ > + || _fail "$UMOUNT_PROG failed. Exiting" > +} > + > +run_test() > +{ > + local FSCK_RET > + local ITERS=8 > + local RET=0 > + > + for i in $(seq 1 $ITERS) > + do > + log "----------- Iteration: $i ------------" > + do_resize > + RET=$? > + > + [ "$RET" = "$STOP_ITER" ] && break > + done > + > + log "-------- Iterations Complete ---------" > + log "Checking if FS is in consistent state" > + _check_scratch_fs _check_scratch_fs will exit the test on failure and print error message, which will break the golden image, so there's no need to check fsck ret. > + FSCK_RET=$? > + > + [ "$FSCK_RET" -ne "0" ] && \ > + echo "fs corrupt. Check $seqres.full for more details" > + > + return $FSCK_RET So I removed above hunk on commit. Thanks for the test! And my apology to the HUGE delay on review.. Thanks, Eryu > +} > + > +echo "Silence is golden" > +run_test > +status=$? > + > +exit > diff --git a/tests/ext4/056.out b/tests/ext4/056.out > new file mode 100644 > index 00000000..6142fcd2 > --- /dev/null > +++ b/tests/ext4/056.out > @@ -0,0 +1,2 @@ > +QA output created by 056 > +Silence is golden > -- > 2.27.0
On Sun, Apr 10, 2022 at 06:36:14PM +0800, Eryu Guan wrote: > On Tue, Feb 22, 2022 at 11:20:53PM +0530, Ojaswin Mujoo wrote: > > Kernel currently doesn't support resize of EXT4 mounted > > with sparse_super2 option enabled. Earlier, it used to leave the resize > > incomplete and the fs would be left in an inconsistent state, however commit > > b1489186cc83[1] fixed this to avoid the fs corruption by clearly returning > > -EOPNOTSUPP. > > > > Test to ensure that kernel handles resizing with sparse_super2 correctly. Run > > resize for multiple iterations because this sometimes leads to kernel crash due > > to fs corruption, which we want to detect. > > > > Related commit in mainline: > > > > [1] commit b1489186cc8391e0c1e342f9fbc3eedf6b944c61 > > > > ext4: add check to prevent attempting to resize an fs with sparse_super2 > > > > Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> > > --- > > > > I would like to add a few comments on the approach followed in this > > test: > > > > 1. So although we check the return codes of the resize operation for > > proper logging, the test is only considered to be passed if fsck > > passes after the resize. This is because resizing a patched kernel > > always keeps the fs consistent whereas resizing on unpatched kernel > > always corrupts the fs. > > > > 2. I've noticed that running mkfs + resize multiple times on unpatched > > kernel sometimes results in KASAN reporting use-after-free. Hence, if > > we detect the kernel is unpatched (doesn't return EOPNOTSUPP on > > resize) we continue iterating to capture this. In this case, we don't > > run fsck in each iteration but run it only after all iterations are > > complete because _check_scratch_fs exits the test if it fails. > > > > 3. In case we detect patched kernel, we stop iterating, and run fsck to > > confirm success > > > > tests/ext4/056 | 108 +++++++++++++++++++++++++++++++++++++++++++++ > > tests/ext4/056.out | 2 + > > 2 files changed, 110 insertions(+) > > create mode 100755 tests/ext4/056 > > create mode 100644 tests/ext4/056.out > > > > diff --git a/tests/ext4/056 b/tests/ext4/056 > > new file mode 100755 > > index 00000000..0f275dea > > --- /dev/null > > +++ b/tests/ext4/056 > > @@ -0,0 +1,108 @@ > > +#! /bin/bash > > +# SPDX-License-Identifier: GPL-2.0 > > +# Copyright (c) 2022 IBM. All Rights Reserved. > > +# > > +# We don't currently support resize of EXT4 filesystems mounted > > +# with sparse_super2 option enabled. Earlier, kernel used to leave the resize > > +# incomplete and the fs would be left into an incomplete state, however commit > > +# b1489186cc83[1] fixed this to avoid the fs corruption by clearly returning > > +# -ENOTSUPP. > > +# > > +# This test ensures that kernel handles resizing with sparse_super2 correctly > > +# > > +# Related commit in mainline: > > +# > > +# [1] commit b1489186cc8391e0c1e342f9fbc3eedf6b944c61 > > +# ext4: add check to prevent attempting to resize an fs with sparse_super2 > > +# > > + > > +. ./common/preamble > > +_begin_fstest auto ioctl resize quick > > + > > +# real QA test starts here > > + > > +INITIAL_FS_SIZE=1G > > +RESIZED_FS_SIZE=$((2*1024*1024*1024)) # 2G > > +ONLINE_RESIZE_BLOCK_LIMIT=$((256*1024*1024)) > > + > > +STOP_ITER=255 # Arbitrary return code > > + > > +_supported_fs ext4 > > +_require_scratch_size $(($RESIZED_FS_SIZE/1024)) > > +_require_test_program "ext4_resize" > > + > > +log() > > +{ > > + echo "$seq: $*" >> $seqres.full 2>&1 > > +} > > + > > +do_resize() > > +{ > > + _mkfs_dev -E resize=$ONLINE_RESIZE_BLOCK_LIMIT -O sparse_super2 \ > > + $SCRATCH_DEV $INITIAL_FS_SIZE >> $seqres.full 2>&1 \ > > + || _fail "$MKFS_PROG failed. Exiting" > > + > > + _scratch_mount || _fail "Failed to mount scratch partition. Exiting" > > + > > + local BS=$(_get_block_size $SCRATCH_MNT) > > + local REQUIRED_BLOCKS=$(($RESIZED_FS_SIZE/$BS)) > > + > > + local RESIZE_RET > > + local EOPNOTSUPP=95 > > + > > + log "Resizing FS" > > + $here/src/ext4_resize $SCRATCH_MNT $REQUIRED_BLOCKS >> $seqres.full 2>&1 > > + RESIZE_RET=$? > > + > > + # Test appears to be successful. Stop iterating and confirm if FS is > > + # consistent. > > + if [ $RESIZE_RET = $EOPNOTSUPP ] > > + then > > + log "Resize operation not supported with sparse_super2" > > + log "Threw expected error $RESIZE_RET (EOPNOTSUPP)" > > + return $STOP_ITER > > + fi > > + > > + # Test appears to be unsuccessful. Most probably, the fs is corrupt, > > + # iterate a few more times to further stress test this. > > + log "Something is wrong. Output of resize = $RESIZE_RET. \ > > + Expected $EOPNOTSUPP (EOPNOTSUPP)" > > + > > + # unmount after ioctl sometimes fails with "device busy" so add a small > > + # delay > > + sleep 0.2 > > + _scratch_unmount >> $seqres.full 2>&1 \ > > + || _fail "$UMOUNT_PROG failed. Exiting" > > +} > > + > > +run_test() > > +{ > > + local FSCK_RET > > + local ITERS=8 > > + local RET=0 > > + > > + for i in $(seq 1 $ITERS) > > + do > > + log "----------- Iteration: $i ------------" > > + do_resize > > + RET=$? > > + > > + [ "$RET" = "$STOP_ITER" ] && break > > + done > > + > > + log "-------- Iterations Complete ---------" > > + log "Checking if FS is in consistent state" > > + _check_scratch_fs > > _check_scratch_fs will exit the test on failure and print error message, > which will break the golden image, so there's no need to check fsck ret. > > > + FSCK_RET=$? > > + > > + [ "$FSCK_RET" -ne "0" ] && \ > > + echo "fs corrupt. Check $seqres.full for more details" > > + > > + return $FSCK_RET > > So I removed above hunk on commit. > > Thanks for the test! And my apology to the HUGE delay on review.. No worries Eryu, thanks a lot for the review :) Regards, Ojaswin > > Thanks, > Eryu > > > +} > > + > > +echo "Silence is golden" > > +run_test > > +status=$? > > + > > +exit > > diff --git a/tests/ext4/056.out b/tests/ext4/056.out > > new file mode 100644 > > index 00000000..6142fcd2 > > --- /dev/null > > +++ b/tests/ext4/056.out > > @@ -0,0 +1,2 @@ > > +QA output created by 056 > > +Silence is golden > > -- > > 2.27.0
diff --git a/tests/ext4/056 b/tests/ext4/056 new file mode 100755 index 00000000..0f275dea --- /dev/null +++ b/tests/ext4/056 @@ -0,0 +1,108 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (c) 2022 IBM. All Rights Reserved. +# +# We don't currently support resize of EXT4 filesystems mounted +# with sparse_super2 option enabled. Earlier, kernel used to leave the resize +# incomplete and the fs would be left into an incomplete state, however commit +# b1489186cc83[1] fixed this to avoid the fs corruption by clearly returning +# -ENOTSUPP. +# +# This test ensures that kernel handles resizing with sparse_super2 correctly +# +# Related commit in mainline: +# +# [1] commit b1489186cc8391e0c1e342f9fbc3eedf6b944c61 +# ext4: add check to prevent attempting to resize an fs with sparse_super2 +# + +. ./common/preamble +_begin_fstest auto ioctl resize quick + +# real QA test starts here + +INITIAL_FS_SIZE=1G +RESIZED_FS_SIZE=$((2*1024*1024*1024)) # 2G +ONLINE_RESIZE_BLOCK_LIMIT=$((256*1024*1024)) + +STOP_ITER=255 # Arbitrary return code + +_supported_fs ext4 +_require_scratch_size $(($RESIZED_FS_SIZE/1024)) +_require_test_program "ext4_resize" + +log() +{ + echo "$seq: $*" >> $seqres.full 2>&1 +} + +do_resize() +{ + _mkfs_dev -E resize=$ONLINE_RESIZE_BLOCK_LIMIT -O sparse_super2 \ + $SCRATCH_DEV $INITIAL_FS_SIZE >> $seqres.full 2>&1 \ + || _fail "$MKFS_PROG failed. Exiting" + + _scratch_mount || _fail "Failed to mount scratch partition. Exiting" + + local BS=$(_get_block_size $SCRATCH_MNT) + local REQUIRED_BLOCKS=$(($RESIZED_FS_SIZE/$BS)) + + local RESIZE_RET + local EOPNOTSUPP=95 + + log "Resizing FS" + $here/src/ext4_resize $SCRATCH_MNT $REQUIRED_BLOCKS >> $seqres.full 2>&1 + RESIZE_RET=$? + + # Test appears to be successful. Stop iterating and confirm if FS is + # consistent. + if [ $RESIZE_RET = $EOPNOTSUPP ] + then + log "Resize operation not supported with sparse_super2" + log "Threw expected error $RESIZE_RET (EOPNOTSUPP)" + return $STOP_ITER + fi + + # Test appears to be unsuccessful. Most probably, the fs is corrupt, + # iterate a few more times to further stress test this. + log "Something is wrong. Output of resize = $RESIZE_RET. \ + Expected $EOPNOTSUPP (EOPNOTSUPP)" + + # unmount after ioctl sometimes fails with "device busy" so add a small + # delay + sleep 0.2 + _scratch_unmount >> $seqres.full 2>&1 \ + || _fail "$UMOUNT_PROG failed. Exiting" +} + +run_test() +{ + local FSCK_RET + local ITERS=8 + local RET=0 + + for i in $(seq 1 $ITERS) + do + log "----------- Iteration: $i ------------" + do_resize + RET=$? + + [ "$RET" = "$STOP_ITER" ] && break + done + + log "-------- Iterations Complete ---------" + log "Checking if FS is in consistent state" + _check_scratch_fs + FSCK_RET=$? + + [ "$FSCK_RET" -ne "0" ] && \ + echo "fs corrupt. Check $seqres.full for more details" + + return $FSCK_RET +} + +echo "Silence is golden" +run_test +status=$? + +exit diff --git a/tests/ext4/056.out b/tests/ext4/056.out new file mode 100644 index 00000000..6142fcd2 --- /dev/null +++ b/tests/ext4/056.out @@ -0,0 +1,2 @@ +QA output created by 056 +Silence is golden
Kernel currently doesn't support resize of EXT4 mounted with sparse_super2 option enabled. Earlier, it used to leave the resize incomplete and the fs would be left in an inconsistent state, however commit b1489186cc83[1] fixed this to avoid the fs corruption by clearly returning -EOPNOTSUPP. Test to ensure that kernel handles resizing with sparse_super2 correctly. Run resize for multiple iterations because this sometimes leads to kernel crash due to fs corruption, which we want to detect. Related commit in mainline: [1] commit b1489186cc8391e0c1e342f9fbc3eedf6b944c61 ext4: add check to prevent attempting to resize an fs with sparse_super2 Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> --- I would like to add a few comments on the approach followed in this test: 1. So although we check the return codes of the resize operation for proper logging, the test is only considered to be passed if fsck passes after the resize. This is because resizing a patched kernel always keeps the fs consistent whereas resizing on unpatched kernel always corrupts the fs. 2. I've noticed that running mkfs + resize multiple times on unpatched kernel sometimes results in KASAN reporting use-after-free. Hence, if we detect the kernel is unpatched (doesn't return EOPNOTSUPP on resize) we continue iterating to capture this. In this case, we don't run fsck in each iteration but run it only after all iterations are complete because _check_scratch_fs exits the test if it fails. 3. In case we detect patched kernel, we stop iterating, and run fsck to confirm success tests/ext4/056 | 108 +++++++++++++++++++++++++++++++++++++++++++++ tests/ext4/056.out | 2 + 2 files changed, 110 insertions(+) create mode 100755 tests/ext4/056 create mode 100644 tests/ext4/056.out