Message ID | 20170814070313.20092-1-lufq.fnst@cn.fujitsu.com (mailing list archive) |
---|---|
State | New, archived |
On Mon, Aug 14, 2017 at 03:03:13PM +0800, Lu Fengqi wrote:
> I catch this following error from dmesg when this testcase fails.
>
> [17446.661127] Buffer I/O error on dev sdb1, logical block 64, async page read
>
> We expect to inject disk IO errors on the device when xfs_io reads the
> specific file, but other processes may trigger IO error earlier. So, we
> can use task-filter to solve this problem.
>
> Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>

This looks OK to me. Does btrfs/143 need a similar fix?

Thanks,
Eryu
On Tue, Aug 15, 2017 at 05:16:06PM +0800, Eryu Guan wrote:
>On Mon, Aug 14, 2017 at 03:03:13PM +0800, Lu Fengqi wrote:
>> I catch this following error from dmesg when this testcase fails.
>>
>> [17446.661127] Buffer I/O error on dev sdb1, logical block 64, async page read
>>
>> We expect to inject disk IO errors on the device when xfs_io reads the
>> specific file, but other processes may trigger IO error earlier. So, we
>> can use task-filter to solve this problem.
>>
>> Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
>
>This looks OK to me. Does btrfs/143 need a similar fix?
>
>Thanks,
>Eryu
>
>

Although btrfs/143 has a similar problem, this method doesn't work for it.
Unlike btrfs/142, the second IO error needs to be triggered by another
process, not by xfs_io. So we can't simply set task-filter to solve this
problem. I'm still investigating a way to solve the problem. Do you have any
ideas? Any help will be greatly appreciated.
diff --git a/tests/btrfs/142 b/tests/btrfs/142
index 414af1b2..5bd8b728 100755
--- a/tests/btrfs/142
+++ b/tests/btrfs/142
@@ -75,6 +75,7 @@ start_fail()
 {
 	echo 100 > $DEBUGFS_MNT/fail_make_request/probability
 	echo 2 > $DEBUGFS_MNT/fail_make_request/times
+	echo 1 > $DEBUGFS_MNT/fail_make_request/task-filter
 	echo 0 > $DEBUGFS_MNT/fail_make_request/verbose
 	echo 1 > $SYSFS_BDEV/make-it-fail
 }
@@ -83,6 +84,7 @@ stop_fail()
 {
 	echo 0 > $DEBUGFS_MNT/fail_make_request/probability
 	echo 0 > $DEBUGFS_MNT/fail_make_request/times
+	echo 0 > $DEBUGFS_MNT/fail_make_request/task-filter
 	echo 0 > $SYSFS_BDEV/make-it-fail
 }
 
@@ -118,16 +120,17 @@ echo "step 3......repair the bad copy" >>$seqres.full
 # since raid1 consists of two copies, and the bad copy was put on stripe #1
 # while the good copy lies on stripe #0, the bad copy only gets access when the
 # reader's pid % 2 == 1 is true
-while true; do
-	# start_fail only fails the following dio read so the repair is
+start_fail
+while [[ -z ${result} ]]; do
+	# enable task-filter only fails the following dio read so the repair is
 	# supposed to work.
-	start_fail
-	$XFS_IO_PROG -d -c "pread -b 128K 0 128K" "$SCRATCH_MNT/foobar" > /dev/null &
-	pid=$!
-	wait
-	stop_fail
-	[ $((pid % 2)) == 1 ] && break
+	result=$(bash -c "
+	if [[ \$((\$\$ % 2)) -eq 1 ]]; then
+		echo 1 > /proc/\$\$/make-it-fail
+		exec $XFS_IO_PROG -d -c \"pread -b 128K 0 128K\" \"$SCRATCH_MNT/foobar\"
+	fi");
 done
+stop_fail
 
 _scratch_unmount
I catch this following error from dmesg when this testcase fails.

[17446.661127] Buffer I/O error on dev sdb1, logical block 64, async page read

We expect to inject disk IO errors on the device when xfs_io reads the
specific file, but other processes may trigger IO error earlier. So, we
can use task-filter to solve this problem.

Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
---
 tests/btrfs/142 | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)
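
For readers following the patch: the task-filter approach appears to rely on the wrapper shell writing to /proc/$$/make-it-fail and then using exec, so that xfs_io runs under the same PID that was marked (and whose parity was checked). The sketch below is not part of the patch and configures no fault injection; it only illustrates that a command started with exec keeps the wrapper's PID. The variable names wrapper_pid and payload_pid are made up for illustration.

```shell
#!/bin/bash
# Illustration only (not from the patch): a command started with "exec"
# replaces the wrapper shell and therefore keeps its PID. No fault
# injection is set up here; only the PID behaviour is shown.
pids=$(bash -c '
	echo "$$"                  # PID of the wrapper shell
	exec bash -c "echo \$\$"   # replaces the wrapper and prints the PID again
')

# Hypothetical variable names, used only for this demonstration.
wrapper_pid=$(echo "$pids" | sed -n 1p)
payload_pid=$(echo "$pids" | sed -n 2p)

echo "wrapper=$wrapper_pid payload=$payload_pid"
```

Both values should be identical, which is why a per-task flag set by the wrapper before exec would still apply to the process doing the read, assuming the kernel keeps that flag across exec.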