Message ID | 20170920235243.11822-1-bo.li.liu@oracle.com (mailing list archive) |
---|---|
State | New, archived |
On Wed, Sep 20, 2017 at 05:52:43PM -0600, Liu Bo wrote:
> We had a bug in btrfs compression code which could end up with a
> kernel panic.
>
> This is adding a regression test for the bug and I've also sent a
> kernel patch to fix the bug.
>
> The patch is "Btrfs: fix kernel oops while reading compressed data".
>
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> ---
>  tests/btrfs/150     | 102 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tests/btrfs/150.out |   3 ++
>  tests/btrfs/group   |   1 +
>  3 files changed, 106 insertions(+)
>  create mode 100755 tests/btrfs/150
>  create mode 100644 tests/btrfs/150.out
>
> diff --git a/tests/btrfs/150 b/tests/btrfs/150
> new file mode 100755
> index 0000000..834be51
> --- /dev/null
> +++ b/tests/btrfs/150
> @@ -0,0 +1,102 @@
> +#! /bin/bash
> +# FS QA Test btrfs/150
> +#
> +# This is a regression test which ends up with a kernel oops in btrfs.

group += dangerous

> +# It occurs when btrfs's read repair happens while reading a compressed
> +# extent.
> +# The patch for this is
> +# xxxxx

Incomplete?

> +#
> +#-----------------------------------------------------------------------
> +# Copyright (c) 2017 Liu Bo. All Rights Reserved.

You're signing off this patch an Oracle employee, but claiming
personal copyright. Please clarify who owns the copyright - if it's
your personal copyright then please sign off with a personal email
address, not your employer's...

Also, I note that these recently added tests from you:

tests/btrfs/140:# Copyright (c) 2017 Liu Bo. All Rights Reserved.
tests/btrfs/141:# Copyright (c) 2017 Liu Bo. All Rights Reserved.
tests/btrfs/142:# Copyright (c) 2017 Liu Bo. All Rights Reserved.
tests/btrfs/143:# Copyright (c) 2017 Liu Bo. All Rights Reserved.
tests/generic/406:# Copyright (c) 2017 Liu Bo. All Rights Reserved.

all have this same ambiguity - personal copyright with employer
signoff in the commit. This definitely needs clarification and
fixing if it is wrong....

> +disable_io_failure()
> +{
> +	echo 0 > $SYSFS_BDEV/make-it-fail
> +	echo 0 > $DEBUGFS_MNT/fail_make_request/probability
> +	echo 0 > $DEBUGFS_MNT/fail_make_request/times
> +}
> +
> +_scratch_pool_mkfs "-d raid1 -b 1G" >> $seqres.full 2>&1
> +
> +# It doesn't matter which compression algorithm we use.
> +_scratch_mount -ocompress
> +
> +# Create a file with all data being compressed
> +$XFS_IO_PROG -f -c "pwrite -W 0 8K" $SCRATCH_MNT/foobar | _filter_xfs_io

needs an fsync to reach disk.

> +# Raid1 consists of two copies and btrfs decides which copy to read by reader's
> +# %pid. Now we inject errors to copy #1 and copy #0 is good. We want to read
> +# the bad copy to trigger read-repair.
> +while true; do
> +	disable_io_failure
> +	# invalidate the page cache
> +	$XFS_IO_PROG -f -c "fadvise -d 0 128K" $SCRATCH_MNT/foobar | _filter_xfs_io
> +
> +	enable_io_failure
> +	od -x $SCRATCH_MNT/foobar > /dev/null &

why are you using od to read the data when the output is piped to
dev/null? why not just xfs_io -c "pread 0 8k" ?

Cheers,

Dave.
On Wed, Sep 20, 2017 at 05:52:43PM -0600, Liu Bo wrote:
>We had a bug in btrfs compression code which could end up with a
>kernel panic.
>
>This is adding a regression test for the bug and I've also sent a
>kernel patch to fix the bug.
>
>The patch is "Btrfs: fix kernel oops while reading compressed data".
>
>Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
>---
> tests/btrfs/150     | 102 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> tests/btrfs/150.out |   3 ++
> tests/btrfs/group   |   1 +
> 3 files changed, 106 insertions(+)
> create mode 100755 tests/btrfs/150
> create mode 100644 tests/btrfs/150.out
>
>diff --git a/tests/btrfs/150 b/tests/btrfs/150
>new file mode 100755
>index 0000000..834be51
>--- /dev/null
>+++ b/tests/btrfs/150
>@@ -0,0 +1,102 @@
>+#! /bin/bash
>+# FS QA Test btrfs/150
>+#
>+# This is a regression test which ends up with a kernel oops in btrfs.
>+# It occurs when btrfs's read repair happens while reading a compressed
>+# extent.
>+# The patch for this is
>+# xxxxx
>+#
>+#-----------------------------------------------------------------------
>+# Copyright (c) 2017 Liu Bo. All Rights Reserved.
>+#
>+# This program is free software; you can redistribute it and/or
>+# modify it under the terms of the GNU General Public License as
>+# published by the Free Software Foundation.
>+#
>+# This program is distributed in the hope that it would be useful,
>+# but WITHOUT ANY WARRANTY; without even the implied warranty of
>+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>+# GNU General Public License for more details.
>+#
>+# You should have received a copy of the GNU General Public License
>+# along with this program; if not, write the Free Software Foundation,
>+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
>+#-----------------------------------------------------------------------
>+#
>+
>+seq=`basename $0`
>+seqres=$RESULT_DIR/$seq
>+echo "QA output created by $seq"
>+
>+here=`pwd`
>+tmp=/tmp/$$
>+status=1 # failure is the default!
>+trap "_cleanup; exit \$status" 0 1 2 3 15
>+
>+_cleanup()
>+{
>+	cd /
>+	rm -f $tmp.*
>+}
>+
>+# get standard environment, filters and checks
>+. ./common/rc
>+. ./common/filter
>+
>+# remove previous $seqres.full before test
>+rm -f $seqres.full
>+
>+# real QA test starts here
>+
>+# Modify as appropriate.
>+_supported_fs btrfs
>+_supported_os Linux
>+_require_scratch
>+_require_fail_make_request
>+_require_scratch_dev_pool 2
>+
>+SYSFS_BDEV=`_sysfs_dev $SCRATCH_DEV`
>+enable_io_failure()
>+{
>+	echo 100 > $DEBUGFS_MNT/fail_make_request/probability
>+	echo 1000 > $DEBUGFS_MNT/fail_make_request/times

What does 1000 mean? Enough failures?
Why not set times to -1?

>+	echo 0 > $DEBUGFS_MNT/fail_make_request/verbose
>+	echo 1 > $SYSFS_BDEV/make-it-fail
>+}
>+
>+disable_io_failure()
>+{
>+	echo 0 > $SYSFS_BDEV/make-it-fail
>+	echo 0 > $DEBUGFS_MNT/fail_make_request/probability
>+	echo 0 > $DEBUGFS_MNT/fail_make_request/times
>+}
>+
>+_scratch_pool_mkfs "-d raid1 -b 1G" >> $seqres.full 2>&1
>+
>+# It doesn't matter which compression algorithm we use.
>+_scratch_mount -ocompress
>+
>+# Create a file with all data being compressed
>+$XFS_IO_PROG -f -c "pwrite -W 0 8K" $SCRATCH_MNT/foobar | _filter_xfs_io
>+
>+# Raid1 consists of two copies and btrfs decides which copy to read by reader's
>+# %pid. Now we inject errors to copy #1 and copy #0 is good. We want to read
>+# the bad copy to trigger read-repair.
>+while true; do
>+	disable_io_failure
>+	# invalidate the page cache
>+	$XFS_IO_PROG -f -c "fadvise -d 0 128K" $SCRATCH_MNT/foobar | _filter_xfs_io
>+
>+	enable_io_failure
>+	od -x $SCRATCH_MNT/foobar > /dev/null &
>+	pid=$!
>+	wait
>+	[ $((pid % 2)) == 1 ] && break
>+done
>+
>+disable_io_failure
>+
>+# success, all done
>+status=0
>+exit
>diff --git a/tests/btrfs/150.out b/tests/btrfs/150.out
>new file mode 100644
>index 0000000..c492c24
>--- /dev/null
>+++ b/tests/btrfs/150.out
>@@ -0,0 +1,3 @@
>+QA output created by 150
>+wrote 8192/8192 bytes at offset 0
>+XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>diff --git a/tests/btrfs/group b/tests/btrfs/group
>index 70c3f05..b70a122 100644
>--- a/tests/btrfs/group
>+++ b/tests/btrfs/group
>@@ -152,3 +152,4 @@
> 147 auto quick send
> 148 auto quick rw
> 149 auto quick send compress
>+150 auto quick
>--
>2.5.0
>
>--
>To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 21, 2017 at 03:03:45PM +0800, Lu Fengqi wrote:
> On Wed, Sep 20, 2017 at 05:52:43PM -0600, Liu Bo wrote:
> >We had a bug in btrfs compression code which could end up with a
> >kernel panic.
> >
> >This is adding a regression test for the bug and I've also sent a
> >kernel patch to fix the bug.
> >
> >The patch is "Btrfs: fix kernel oops while reading compressed data".
> >
> >Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> >---
> > tests/btrfs/150     | 102 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> > tests/btrfs/150.out |   3 ++
> > tests/btrfs/group   |   1 +
> > 3 files changed, 106 insertions(+)
> > create mode 100755 tests/btrfs/150
> > create mode 100644 tests/btrfs/150.out
> >
> >diff --git a/tests/btrfs/150 b/tests/btrfs/150
> >new file mode 100755
> >index 0000000..834be51
> >--- /dev/null
> >+++ b/tests/btrfs/150
> >@@ -0,0 +1,102 @@
> >+#! /bin/bash
> >+# FS QA Test btrfs/150
> >+#
> >+# This is a regression test which ends up with a kernel oops in btrfs.
> >+# It occurs when btrfs's read repair happens while reading a compressed
> >+# extent.
> >+# The patch for this is
> >+# xxxxx
> >+#
> >+#-----------------------------------------------------------------------
> >+# Copyright (c) 2017 Liu Bo. All Rights Reserved.
> >+#
> >+# This program is free software; you can redistribute it and/or
> >+# modify it under the terms of the GNU General Public License as
> >+# published by the Free Software Foundation.
> >+#
> >+# This program is distributed in the hope that it would be useful,
> >+# but WITHOUT ANY WARRANTY; without even the implied warranty of
> >+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> >+# GNU General Public License for more details.
> >+#
> >+# You should have received a copy of the GNU General Public License
> >+# along with this program; if not, write the Free Software Foundation,
> >+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> >+#-----------------------------------------------------------------------
> >+#
> >+
> >+seq=`basename $0`
> >+seqres=$RESULT_DIR/$seq
> >+echo "QA output created by $seq"
> >+
> >+here=`pwd`
> >+tmp=/tmp/$$
> >+status=1 # failure is the default!
> >+trap "_cleanup; exit \$status" 0 1 2 3 15
> >+
> >+_cleanup()
> >+{
> >+	cd /
> >+	rm -f $tmp.*
> >+}
> >+
> >+# get standard environment, filters and checks
> >+. ./common/rc
> >+. ./common/filter
> >+
> >+# remove previous $seqres.full before test
> >+rm -f $seqres.full
> >+
> >+# real QA test starts here
> >+
> >+# Modify as appropriate.
> >+_supported_fs btrfs
> >+_supported_os Linux
> >+_require_scratch
> >+_require_fail_make_request
> >+_require_scratch_dev_pool 2
> >+
> >+SYSFS_BDEV=`_sysfs_dev $SCRATCH_DEV`
> >+enable_io_failure()
> >+{
> >+	echo 100 > $DEBUGFS_MNT/fail_make_request/probability
> >+	echo 1000 > $DEBUGFS_MNT/fail_make_request/times
>
> What does 1000 mean? Enough failures?
> Why not set times to -1?

This was copied from another test, so I kept it as is. As this test
just submits a single 8K read after enabling fault injection, 1000 is
in fact as same as -1(no limit), I think 1000 is OK to use.

thanks,
-liubo

>
> >+	echo 0 > $DEBUGFS_MNT/fail_make_request/verbose
> >+	echo 1 > $SYSFS_BDEV/make-it-fail
> >+}
> >+
> >+disable_io_failure()
> >+{
> >+	echo 0 > $SYSFS_BDEV/make-it-fail
> >+	echo 0 > $DEBUGFS_MNT/fail_make_request/probability
> >+	echo 0 > $DEBUGFS_MNT/fail_make_request/times
> >+}
> >+
> >+_scratch_pool_mkfs "-d raid1 -b 1G" >> $seqres.full 2>&1
> >+
> >+# It doesn't matter which compression algorithm we use.
> >+_scratch_mount -ocompress
> >+
> >+# Create a file with all data being compressed
> >+$XFS_IO_PROG -f -c "pwrite -W 0 8K" $SCRATCH_MNT/foobar | _filter_xfs_io
> >+
> >+# Raid1 consists of two copies and btrfs decides which copy to read by reader's
> >+# %pid. Now we inject errors to copy #1 and copy #0 is good. We want to read
> >+# the bad copy to trigger read-repair.
> >+while true; do
> >+	disable_io_failure
> >+	# invalidate the page cache
> >+	$XFS_IO_PROG -f -c "fadvise -d 0 128K" $SCRATCH_MNT/foobar | _filter_xfs_io
> >+
> >+	enable_io_failure
> >+	od -x $SCRATCH_MNT/foobar > /dev/null &
> >+	pid=$!
> >+	wait
> >+	[ $((pid % 2)) == 1 ] && break
> >+done
> >+
> >+disable_io_failure
> >+
> >+# success, all done
> >+status=0
> >+exit
> >diff --git a/tests/btrfs/150.out b/tests/btrfs/150.out
> >new file mode 100644
> >index 0000000..c492c24
> >--- /dev/null
> >+++ b/tests/btrfs/150.out
> >@@ -0,0 +1,3 @@
> >+QA output created by 150
> >+wrote 8192/8192 bytes at offset 0
> >+XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> >diff --git a/tests/btrfs/group b/tests/btrfs/group
> >index 70c3f05..b70a122 100644
> >--- a/tests/btrfs/group
> >+++ b/tests/btrfs/group
> >@@ -152,3 +152,4 @@
> > 147 auto quick send
> > 148 auto quick rw
> > 149 auto quick send compress
> >+150 auto quick
> >--
> >2.5.0
> >
> >--
> >To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> >the body of a message to majordomo@vger.kernel.org
> >More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> --
> Thanks,
> Lu

--
To unsubscribe from this list: send the line "unsubscribe fstests" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 21, 2017 at 02:39:52PM +1000, Dave Chinner wrote:
> On Wed, Sep 20, 2017 at 05:52:43PM -0600, Liu Bo wrote:
> > We had a bug in btrfs compression code which could end up with a
> > kernel panic.
> >
> > This is adding a regression test for the bug and I've also sent a
> > kernel patch to fix the bug.
> >
> > The patch is "Btrfs: fix kernel oops while reading compressed data".
> >
> > Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> > ---
> >  tests/btrfs/150     | 102 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  tests/btrfs/150.out |   3 ++
> >  tests/btrfs/group   |   1 +
> >  3 files changed, 106 insertions(+)
> >  create mode 100755 tests/btrfs/150
> >  create mode 100644 tests/btrfs/150.out
> >
> > diff --git a/tests/btrfs/150 b/tests/btrfs/150
> > new file mode 100755
> > index 0000000..834be51
> > --- /dev/null
> > +++ b/tests/btrfs/150
> > @@ -0,0 +1,102 @@
> > +#! /bin/bash
> > +# FS QA Test btrfs/150
> > +#
> > +# This is a regression test which ends up with a kernel oops in btrfs.
>
> group += dangerous

OK.

>
> > +# It occurs when btrfs's read repair happens while reading a compressed
> > +# extent.
> > +# The patch for this is
> > +# xxxxx
>
> Incomplete?

Urr, thanks for pointing it out.

>
> > +#
> > +#-----------------------------------------------------------------------
> > +# Copyright (c) 2017 Liu Bo. All Rights Reserved.
>
> You're signing off this patch an Oracle employee, but claiming
> personal copyright. Please clarify who owns the copyright - if it's
> your personal copyright then please sign off with a personal email
> address, not your employer's...
>
> Also, I note that these recently added tests from you:
>
> tests/btrfs/140:# Copyright (c) 2017 Liu Bo. All Rights Reserved.
> tests/btrfs/141:# Copyright (c) 2017 Liu Bo. All Rights Reserved.
> tests/btrfs/142:# Copyright (c) 2017 Liu Bo. All Rights Reserved.
> tests/btrfs/143:# Copyright (c) 2017 Liu Bo. All Rights Reserved.
> tests/generic/406:# Copyright (c) 2017 Liu Bo. All Rights Reserved.
>
> all have this same ambiguity - personal copyright with employer
> signoff in the commit. This definitely needs clarification and
> fixing if it is wrong....
>

All right, will fix all of them (in a separate one).

>
> > +disable_io_failure()
> > +{
> > +	echo 0 > $SYSFS_BDEV/make-it-fail
> > +	echo 0 > $DEBUGFS_MNT/fail_make_request/probability
> > +	echo 0 > $DEBUGFS_MNT/fail_make_request/times
> > +}
> > +
> > +_scratch_pool_mkfs "-d raid1 -b 1G" >> $seqres.full 2>&1
> > +
> > +# It doesn't matter which compression algorithm we use.
> > +_scratch_mount -ocompress
> > +
> > +# Create a file with all data being compressed
> > +$XFS_IO_PROG -f -c "pwrite -W 0 8K" $SCRATCH_MNT/foobar | _filter_xfs_io
>
> needs an fsync to reach disk.

'pwrite -W' has ensured that.

>
> > +# Raid1 consists of two copies and btrfs decides which copy to read by reader's
> > +# %pid. Now we inject errors to copy #1 and copy #0 is good. We want to read
> > +# the bad copy to trigger read-repair.
> > +while true; do
> > +	disable_io_failure
> > +	# invalidate the page cache
> > +	$XFS_IO_PROG -f -c "fadvise -d 0 128K" $SCRATCH_MNT/foobar | _filter_xfs_io
> > +
> > +	enable_io_failure
> > +	od -x $SCRATCH_MNT/foobar > /dev/null &
>
> why are you using od to read the data when the output is piped to
> dev/null? why not just xfs_io -c "pread 0 8k" ?

Oh yes, that's better, will do.

thanks,
-liubo
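The read loop discussed above can be sketched outside of fstests as follows. This is only an illustration under stated assumptions: `dd` is swapped in for the suggested `xfs_io -c "pread 0 8k"` so that it runs without xfsprogs, the helper name `read_with_odd_pid` and the `max_tries` bound are hypothetical and not part of the patch, and the parity check only influences mirror selection on a btrfs raid1 filesystem. The reader is started as a simple background command so that `$!` is the PID of the process that actually issues the read.

```shell
# Hypothetical helper: repeat a background 8K read until the reader's
# PID is odd, then print that PID. In the test itself the reader would
# be xfs_io's pread; dd is a stand-in here.
read_with_odd_pid() {
	file=$1
	max_tries=${2:-64}	# safety bound; the loop in the patch has none
	tries=0
	while [ "$tries" -lt "$max_tries" ]; do
		# Simple command in the background: $! is dd's own PID.
		dd if="$file" of=/dev/null bs=8192 count=1 2>/dev/null &
		pid=$!
		wait "$pid"
		if [ $((pid % 2)) -eq 1 ]; then
			echo "$pid"
			return 0
		fi
		tries=$((tries + 1))
	done
	return 1
}

# Demo on a scratch file; on a non-btrfs filesystem this only shows the
# retry logic, not read-repair.
demo_file=$(mktemp)
dd if=/dev/zero of="$demo_file" bs=8192 count=1 2>/dev/null
odd_pid=$(read_with_odd_pid "$demo_file")
rm -f "$demo_file"
echo "reader pid: $odd_pid"
```

Waiting on the specific PID rather than calling a bare `wait` keeps the sketch from blocking on unrelated background jobs.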
diff --git a/tests/btrfs/150 b/tests/btrfs/150
new file mode 100755
index 0000000..834be51
--- /dev/null
+++ b/tests/btrfs/150
@@ -0,0 +1,102 @@
+#! /bin/bash
+# FS QA Test btrfs/150
+#
+# This is a regression test which ends up with a kernel oops in btrfs.
+# It occurs when btrfs's read repair happens while reading a compressed
+# extent.
+# The patch for this is
+# xxxxx
+#
+#-----------------------------------------------------------------------
+# Copyright (c) 2017 Liu Bo. All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+#-----------------------------------------------------------------------
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1 # failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	cd /
+	rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# remove previous $seqres.full before test
+rm -f $seqres.full
+
+# real QA test starts here
+
+# Modify as appropriate.
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+_require_fail_make_request
+_require_scratch_dev_pool 2
+
+SYSFS_BDEV=`_sysfs_dev $SCRATCH_DEV`
+enable_io_failure()
+{
+	echo 100 > $DEBUGFS_MNT/fail_make_request/probability
+	echo 1000 > $DEBUGFS_MNT/fail_make_request/times
+	echo 0 > $DEBUGFS_MNT/fail_make_request/verbose
+	echo 1 > $SYSFS_BDEV/make-it-fail
+}
+
+disable_io_failure()
+{
+	echo 0 > $SYSFS_BDEV/make-it-fail
+	echo 0 > $DEBUGFS_MNT/fail_make_request/probability
+	echo 0 > $DEBUGFS_MNT/fail_make_request/times
+}
+
+_scratch_pool_mkfs "-d raid1 -b 1G" >> $seqres.full 2>&1
+
+# It doesn't matter which compression algorithm we use.
+_scratch_mount -ocompress
+
+# Create a file with all data being compressed
+$XFS_IO_PROG -f -c "pwrite -W 0 8K" $SCRATCH_MNT/foobar | _filter_xfs_io
+
+# Raid1 consists of two copies and btrfs decides which copy to read by reader's
+# %pid. Now we inject errors to copy #1 and copy #0 is good. We want to read
+# the bad copy to trigger read-repair.
+while true; do
+	disable_io_failure
+	# invalidate the page cache
+	$XFS_IO_PROG -f -c "fadvise -d 0 128K" $SCRATCH_MNT/foobar | _filter_xfs_io
+
+	enable_io_failure
+	od -x $SCRATCH_MNT/foobar > /dev/null &
+	pid=$!
+	wait
+	[ $((pid % 2)) == 1 ] && break
+done
+
+disable_io_failure
+
+# success, all done
+status=0
+exit
diff --git a/tests/btrfs/150.out b/tests/btrfs/150.out
new file mode 100644
index 0000000..c492c24
--- /dev/null
+++ b/tests/btrfs/150.out
@@ -0,0 +1,3 @@
+QA output created by 150
+wrote 8192/8192 bytes at offset 0
+XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
diff --git a/tests/btrfs/group b/tests/btrfs/group
index 70c3f05..b70a122 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -152,3 +152,4 @@
 147 auto quick send
 148 auto quick rw
 149 auto quick send compress
+150 auto quick
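The `times` value that `enable_io_failure()` writes came up in review. As a rough sketch of why 1000 and -1 behave the same for this test, the hypothetical helper below (not part of fstests) models `times` as a budget of injected failures, assuming `probability` is 100 and the other fault-injection knobs are left at their defaults. The request count of 2 is an illustrative guess, not a measured number.

```shell
# Hypothetical model: print how many of the submitted requests would be
# failed for a given fail_make_request/times value.
# Assumes probability=100 and default interval/space settings.
injected_failures() {
	times=$1
	requests=$2
	if [ "$times" -lt 0 ] || [ "$times" -ge "$requests" ]; then
		# -1 means "no limit"; a budget at least as large as the
		# request count is never exhausted either.
		echo "$requests"
	else
		echo "$times"
	fi
}

# A single 8K read submits far fewer than 1000 requests, so both
# settings fail every request in this model.
injected_failures 1000 2
injected_failures -1 2
```

The two settings only diverge once more requests are submitted than the budget allows, which a single small read does not do.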
We had a bug in btrfs compression code which could end up with a
kernel panic.

This is adding a regression test for the bug and I've also sent a
kernel patch to fix the bug.

The patch is "Btrfs: fix kernel oops while reading compressed data".

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 tests/btrfs/150     | 102 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/btrfs/150.out |   3 ++
 tests/btrfs/group   |   1 +
 3 files changed, 106 insertions(+)
 create mode 100755 tests/btrfs/150
 create mode 100644 tests/btrfs/150.out