Message ID | 20190919150024.8346-1-zlang@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | common/xfs: wipe the XFS superblock of each AGs |
On Thu, Sep 19, 2019 at 11:00:24PM +0800, Zorro Lang wrote:
> xfs/030 always fails after d0e484ac699f ("check: wipe scratch devices
> between tests") get merged.
>
> Due to xfs/030 does a sized(100m) mkfs. Before we merge above commit,
> mkfs.xfs detects an old primary superblock, it will write zeroes to
> all superblocks before formatting the new filesystem. But this won't
> be done if we wipe the first superblock(by merging above commit).
>
> That means if we make a (smaller) sized xfs after wipefs, those *old*
> superblocks which created by last time mkfs.xfs will be left on disk.

One thing missing from this patch -- if the test formatted the scratch
device with non-default geometry, the backup superblocks from that
filesystem will not be erased. Going back to my example from the email
thread, if the scratch disk has:

  SB0 [16M zeroes] SB1 [16M zeroes] <4 more AGs> <zeroes from 100M to 1G> \
      SB'1 [1G space] SB'2 [1G space] SB'3 [1G space]

Where SB[0-5] are the ones written by xfs/030 and SB'[1-3] were written
by a previous test that did the default scratch device mkfs, then this
patch will wipe out SB'[1-3] and SB0:

  000 [16M zeroes] SB1 [16M zeroes] <4 more AGs> <zeroes from 100M to 1G> \
      0000 [1G space] 0000 [1G space] 0000 [1G space]

But that still leaves SB[1-5] which xfs_repair could stumble over later.
For example, if the next test to be run formats a filesystem with 24MB
AGs (instead of 16) and zaps the superblock, then repair will eventually
try a linear scan looking for superblocks and find the ones from the
16MB filesystem first.

There isn't a sequence of tests that do this, but so long as we're
fixing this we might as well zap as much as we can. So I propose adding
to try_wipe_scratch_xfs() the following:

	dbsize=
	_scratch_xfs_db -c 'sb 0' -c 'p blocksize agblocks agcount' 2>&1 | \
		sed -e 's/ = /=/g' -e 's/blocksize/dbsize/g' \
		    -e 's/agblocks/agsize/g' > $tmp.mkfs
	. $tmp.mkfs

and then repeat the for loop. If there isn't a filesystem then
$tmp.mkfs will be an empty file and the loop won't run.

> Then when we do xfs_repair, if xfs_repair can't find the first SB, it
> will go to find those *old* SB at first. When it finds them,
> everyting goes wrong.
>
> So I try to get XFS AG geometry(by default) and then try to erase all
> superblocks. Thanks Darrick J. Wong helped to analyze this issue.
>
> Signed-off-by: Zorro Lang <zlang@redhat.com>
> ---
>  common/rc  |  4 ++++
>  common/xfs | 23 +++++++++++++++++++++++
>  2 files changed, 27 insertions(+)
>
> diff --git a/common/rc b/common/rc
> index 66c7fd4d..fe13f659 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -4048,6 +4048,10 @@ _try_wipe_scratch_devs()
>  	for dev in $SCRATCH_DEV_POOL $SCRATCH_DEV $SCRATCH_LOGDEV $SCRATCH_RTDEV; do
>  		test -b $dev && $WIPEFS_PROG -a $dev
>  	done
> +
> +	if [ "$FSTYP" = "xfs" ];then
> +		try_wipe_scratch_xfs
> +	fi

We probably ought to delegate all wiping to try_wipe_scratch_xfs, i.e.:

	test -b $dev || continue
	case "$FSTYP" in
	"xfs")
		_try_wipe_scratch_xfs
		;;
	*)
		$WIPEFS_PROG -a $dev
		;;
	esac

and add the WIPEFS_PROG call to _try_wipe_scratch_xfs.

>  }
>
>  # Only run this on xfs if xfs_scrub is available and has the unicode checker
> diff --git a/common/xfs b/common/xfs
> index 1bce3c18..34516f82 100644
> --- a/common/xfs
> +++ b/common/xfs
> @@ -884,3 +884,26 @@ _xfs_mount_agcount()
>  {
>  	$XFS_INFO_PROG "$1" | grep agcount= | sed -e 's/^.*agcount=\([0-9]*\),.*$/\1/g'
>  }
> +
> +# wipe the superblock of each XFS AGs
> +try_wipe_scratch_xfs()

Common helper functions should start with a '_'

> +{
> +	local tmp=`mktemp -u`
> +
> +	_scratch_mkfs_xfs -N 2>/dev/null | perl -ne '
> +		if (/^meta-data=.*\s+agcount=(\d+), agsize=(\d+) blks/) {
> +			print STDOUT "agcount=$1\nagsize=$2\n";
> +		}
> +		if (/^data\s+=\s+bsize=(\d+)\s/) {
> +			print STDOUT "dbsize=$1\n";
> +		}' > $tmp.mkfs
> +
> +	. $tmp.mkfs
> +	if [ -n "$agcount" -a -n "$agsize" -a -n "$dbsize" ];then
> +		for ((i = 0; i < agcount; i++)); do
> +			$XFS_IO_PROG -c "pwrite $((i * dbsize * agsize)) $dbsize" \
> +				$SCRATCH_DEV >/dev/null;
> +		done
> +	fi
> +	rm -f $tmp.mkfs

Add code as discussed above.

--D

> +}
> --
> 2.20.1
>
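The xfs_db step proposed in this review can be sketched as follows. This is a minimal illustration and not the code that was eventually merged: `geometry_from_xfs_db` is a hypothetical helper name, and the sample text assumes that xfs_db prints each requested field as `name = value` on its own line, which is the form the suggested sed expression expects.

```shell
#!/bin/bash
# Sketch: turn "field = value" lines, as expected from
#   xfs_db -c 'sb 0' -c 'p blocksize agblocks agcount'
# into shell assignments that use the names the wipe loop reads.
# geometry_from_xfs_db is a hypothetical name, not an fstests helper.
geometry_from_xfs_db()
{
	sed -n -e 's/^blocksize = \([0-9][0-9]*\)$/dbsize=\1/p' \
	       -e 's/^agblocks = \([0-9][0-9]*\)$/agsize=\1/p' \
	       -e 's/^agcount = \([0-9][0-9]*\)$/agcount=\1/p'
}

# Illustrative input only; real values would come from the scratch device.
sample_output='blocksize = 4096
agblocks = 4096
agcount = 6'

# Clear stale values first so that an empty result skips the wipe loop.
dbsize= agsize= agcount=
eval "$(printf '%s\n' "$sample_output" | geometry_from_xfs_db)"
echo "dbsize=$dbsize agsize=$agsize agcount=$agcount"
```

One deliberate difference from the quoted one-liner: only lines that match a known field are turned into assignments, so any error text that reaches the pipe through `2>&1` is dropped rather than sourced.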
On Thu, Sep 19, 2019 at 09:02:06AM -0700, Darrick J. Wong wrote:
> On Thu, Sep 19, 2019 at 11:00:24PM +0800, Zorro Lang wrote:
> > xfs/030 always fails after d0e484ac699f ("check: wipe scratch devices
> > between tests") get merged.
> >
> > Due to xfs/030 does a sized(100m) mkfs. Before we merge above commit,
> > mkfs.xfs detects an old primary superblock, it will write zeroes to
> > all superblocks before formatting the new filesystem. But this won't
> > be done if we wipe the first superblock(by merging above commit).
> >
> > That means if we make a (smaller) sized xfs after wipefs, those *old*
> > superblocks which created by last time mkfs.xfs will be left on disk.
>
> One thing missing from this patch -- if the test formatted the scratch
> device with non-default geometry, the backup superblocks from that

Makes sense, I didn't think about non-default geometry.

> filesystem will not be erased. Going back to my example from the email
> thread, if the scratch disk has:
>
>   SB0 [16M zeroes] SB1 [16M zeroes] <4 more AGs> <zeroes from 100M to 1G> \
>       SB'1 [1G space] SB'2 [1G space] SB'3 [1G space]
>
> Where SB[0-5] are the ones written by xfs/030 and SB'[1-3] were written
> by a previous test that did the default scratch device mkfs, then this
> patch will wipe out SB'[1-3] and SB0:
>
>   000 [16M zeroes] SB1 [16M zeroes] <4 more AGs> <zeroes from 100M to 1G> \
>       0000 [1G space] 0000 [1G space] 0000 [1G space]
>
> But that still leaves SB[1-5] which xfs_repair could stumble over later.
> For example, if the next test to be run formats a filesystem with 24MB
> AGs (instead of 16) and zaps the superblock, then repair will eventually
> try a linear scan looking for superblocks and find the ones from the
> 16MB filesystem first.
>
> There isn't a sequence of tests that do this, but so long as we're
> fixing this we might as well zap as much as we can. So I propose adding
> to try_wipe_scratch_xfs() the following:
>
> 	dbsize=
> 	_scratch_xfs_db -c 'sb 0' -c 'p blocksize agblocks agcount' 2>&1 | \
> 		sed -e 's/ = /=/g' -e 's/blocksize/dbsize/g' \
> 		    -e 's/agblocks/agsize/g' > $tmp.mkfs
> 	. $tmp.mkfs
>
> and then repeat the for loop. If there isn't a filesystem then
> $tmp.mkfs will be an empty file and the loop won't run.

Sure, although I don't know why we must change the variable's name :)

>
> > Then when we do xfs_repair, if xfs_repair can't find the first SB, it
> > will go to find those *old* SB at first. When it finds them,
> > everyting goes wrong.
> >
> > So I try to get XFS AG geometry(by default) and then try to erase all
> > superblocks. Thanks Darrick J. Wong helped to analyze this issue.
> >
> > Signed-off-by: Zorro Lang <zlang@redhat.com>
> > ---
> >  common/rc  |  4 ++++
> >  common/xfs | 23 +++++++++++++++++++++++
> >  2 files changed, 27 insertions(+)
> >
> > diff --git a/common/rc b/common/rc
> > index 66c7fd4d..fe13f659 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -4048,6 +4048,10 @@ _try_wipe_scratch_devs()
> >  	for dev in $SCRATCH_DEV_POOL $SCRATCH_DEV $SCRATCH_LOGDEV $SCRATCH_RTDEV; do
> >  		test -b $dev && $WIPEFS_PROG -a $dev
> >  	done
> > +
> > +	if [ "$FSTYP" = "xfs" ];then
> > +		try_wipe_scratch_xfs
> > +	fi
>
> We probably ought to delegate all wiping to try_wipe_scratch_xfs, i.e.:
>
> 	test -b $dev || continue
> 	case "$FSTYP" in
> 	"xfs")
> 		_try_wipe_scratch_xfs
> 		;;
> 	*)
> 		$WIPEFS_PROG -a $dev
> 		;;
> 	esac
>
> and add the WIPEFS_PROG call to _try_wipe_scratch_xfs.

Sure, thanks!

Zorro

>
> >  }
> >
> >  # Only run this on xfs if xfs_scrub is available and has the unicode checker
> > diff --git a/common/xfs b/common/xfs
> > index 1bce3c18..34516f82 100644
> > --- a/common/xfs
> > +++ b/common/xfs
> > @@ -884,3 +884,26 @@ _xfs_mount_agcount()
> >  {
> >  	$XFS_INFO_PROG "$1" | grep agcount= | sed -e 's/^.*agcount=\([0-9]*\),.*$/\1/g'
> >  }
> > +
> > +# wipe the superblock of each XFS AGs
> > +try_wipe_scratch_xfs()
>
> Common helper functions should start with a '_'
>
> > +{
> > +	local tmp=`mktemp -u`
> > +
> > +	_scratch_mkfs_xfs -N 2>/dev/null | perl -ne '
> > +		if (/^meta-data=.*\s+agcount=(\d+), agsize=(\d+) blks/) {
> > +			print STDOUT "agcount=$1\nagsize=$2\n";
> > +		}
> > +		if (/^data\s+=\s+bsize=(\d+)\s/) {
> > +			print STDOUT "dbsize=$1\n";
> > +		}' > $tmp.mkfs
> > +
> > +	. $tmp.mkfs
> > +	if [ -n "$agcount" -a -n "$agsize" -a -n "$dbsize" ];then
> > +		for ((i = 0; i < agcount; i++)); do
> > +			$XFS_IO_PROG -c "pwrite $((i * dbsize * agsize)) $dbsize" \
> > +				$SCRATCH_DEV >/dev/null;
> > +		done
> > +	fi
> > +	rm -f $tmp.mkfs
>
> Add code as discussed above.
>
> --D
>
> > +}
> > --
> > 2.20.1
> >
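The delegation agreed on in this exchange might look roughly like the sketch below. It is only an outline under stated assumptions: `wipe_generic` and `wipe_xfs` are stub stand-ins for `$WIPEFS_PROG -a` and the XFS helper, the helper is assumed to take the device as an argument (the posted patch hard-codes `$SCRATCH_DEV`), and the `test -b` block-device check is replaced by a simple non-empty check so the example runs without real devices.

```shell
#!/bin/bash
# Stubs so the sketch runs anywhere; in fstests these would be
# "$WIPEFS_PROG -a $dev" and the XFS-specific helper in common/xfs.
wipe_generic() { echo "wipefs $1"; }
wipe_xfs()     { echo "wipefs $1"; echo "wipe-xfs-superblocks $1"; }

# wipe_scratch_devs FSTYP DEV... (hypothetical name and signature)
wipe_scratch_devs()
{
	local fstyp="$1"; shift
	local dev

	for dev in "$@"; do
		# The real helper would use: test -b $dev || continue
		[ -n "$dev" ] || continue
		case "$fstyp" in
		xfs)
			# The XFS helper does the generic wipe itself as well.
			wipe_xfs "$dev"
			;;
		*)
			wipe_generic "$dev"
			;;
		esac
	done
}

wipe_scratch_devs xfs /dev/sdX1
wipe_scratch_devs ext4 /dev/sdX1
```

Keeping the generic wipe inside the XFS branch means each device is still handled in a single pass, and other filesystem types can grow their own branch later without touching the loop.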
on 2019/09/19 23:00, Zorro Lang wrote:
> xfs/030 always fails after d0e484ac699f ("check: wipe scratch devices
> between tests") get merged.
>
> Due to xfs/030 does a sized(100m) mkfs. Before we merge above commit,
> mkfs.xfs detects an old primary superblock, it will write zeroes to
> all superblocks before formatting the new filesystem. But this won't
> be done if we wipe the first superblock(by merging above commit).
>
> That means if we make a (smaller) sized xfs after wipefs, those *old*
> superblocks which created by last time mkfs.xfs will be left on disk.
> Then when we do xfs_repair, if xfs_repair can't find the first SB, it
> will go to find those *old* SB at first. When it finds them,
> everyting goes wrong.
>
> So I try to get XFS AG geometry(by default) and then try to erase all
> superblocks. Thanks Darrick J. Wong helped to analyze this issue.

Feel free to add Reported-by.

>
> Signed-off-by: Zorro Lang <zlang@redhat.com>
> ---
>  common/rc  |  4 ++++
>  common/xfs | 23 +++++++++++++++++++++++
>  2 files changed, 27 insertions(+)
>
> diff --git a/common/rc b/common/rc
> index 66c7fd4d..fe13f659 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -4048,6 +4048,10 @@ _try_wipe_scratch_devs()
>  	for dev in $SCRATCH_DEV_POOL $SCRATCH_DEV $SCRATCH_LOGDEV $SCRATCH_RTDEV; do
>  		test -b $dev && $WIPEFS_PROG -a $dev
>  	done
> +
> +	if [ "$FSTYP" = "xfs" ];then
> +		try_wipe_scratch_xfs

I think we should add a short comment explaining why we add it.

PS: _scratch_mkfs_xfs also makes the case pass. We could use it and add
a comment. Both the try_wipe_scratch_xfs method and the
_scratch_mkfs_xfs method are acceptable to me.

> +	fi
>  }
>
>  # Only run this on xfs if xfs_scrub is available and has the unicode checker
> diff --git a/common/xfs b/common/xfs
> index 1bce3c18..34516f82 100644
> --- a/common/xfs
> +++ b/common/xfs
> @@ -884,3 +884,26 @@ _xfs_mount_agcount()
>  {
>  	$XFS_INFO_PROG "$1" | grep agcount= | sed -e 's/^.*agcount=\([0-9]*\),.*$/\1/g'
>  }
> +
> +# wipe the superblock of each XFS AGs
> +try_wipe_scratch_xfs()
> +{
> +	local tmp=`mktemp -u`
> +
> +	_scratch_mkfs_xfs -N 2>/dev/null | perl -ne '
> +		if (/^meta-data=.*\s+agcount=(\d+), agsize=(\d+) blks/) {
> +			print STDOUT "agcount=$1\nagsize=$2\n";
> +		}
> +		if (/^data\s+=\s+bsize=(\d+)\s/) {
> +			print STDOUT "dbsize=$1\n";
> +		}' > $tmp.mkfs
> +
> +	. $tmp.mkfs
> +	if [ -n "$agcount" -a -n "$agsize" -a -n "$dbsize" ];then
> +		for ((i = 0; i < agcount; i++)); do
> +			$XFS_IO_PROG -c "pwrite $((i * dbsize * agsize)) $dbsize" \
> +				$SCRATCH_DEV >/dev/null;
> +		done
> +	fi
> +	rm -f $tmp.mkfs
> +}
>
On Fri, Sep 20, 2019 at 09:52:11AM +0800, Yang Xu wrote:
>
>
> on 2019/09/19 23:00, Zorro Lang wrote:
> > xfs/030 always fails after d0e484ac699f ("check: wipe scratch devices
> > between tests") get merged.
> >
> > Due to xfs/030 does a sized(100m) mkfs. Before we merge above commit,
> > mkfs.xfs detects an old primary superblock, it will write zeroes to
> > all superblocks before formatting the new filesystem. But this won't
> > be done if we wipe the first superblock(by merging above commit).
> >
> > That means if we make a (smaller) sized xfs after wipefs, those *old*
> > superblocks which created by last time mkfs.xfs will be left on disk.
> > Then when we do xfs_repair, if xfs_repair can't find the first SB, it
> > will go to find those *old* SB at first. When it finds them,
> > everyting goes wrong.
> >
> > So I try to get XFS AG geometry(by default) and then try to erase all
> > superblocks. Thanks Darrick J. Wong helped to analyze this issue.
> Feel free to add Reported-by.
> >
> > Signed-off-by: Zorro Lang <zlang@redhat.com>
> > ---
> >  common/rc  |  4 ++++
> >  common/xfs | 23 +++++++++++++++++++++++
> >  2 files changed, 27 insertions(+)
> >
> > diff --git a/common/rc b/common/rc
> > index 66c7fd4d..fe13f659 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -4048,6 +4048,10 @@ _try_wipe_scratch_devs()
> >  	for dev in $SCRATCH_DEV_POOL $SCRATCH_DEV $SCRATCH_LOGDEV $SCRATCH_RTDEV; do
> >  		test -b $dev && $WIPEFS_PROG -a $dev
> >  	done
> > +
> > +	if [ "$FSTYP" = "xfs" ];then
> > +		try_wipe_scratch_xfs
> I think we should add a simple comment for why we add it.
>
> ps:_scratch_mkfs_xfs also can make case pass. We can use it and add comment.
> the try_wipe_scratch_xfs method and the _scratch_mkfs_xfs method are all
> acceptable for me.

Yes, I suppose formatting and then wiping per below would also achieve
our means, but it would come at the extra cost of zeroing the log. I'm
not too eager to increase xfstest runtime even more.

Hmmm, I wonder if xfs_db could just grow a 'wipe all superblocks'
command....

--D

> > +	fi
> >  }
> >  # Only run this on xfs if xfs_scrub is available and has the unicode checker
> > diff --git a/common/xfs b/common/xfs
> > index 1bce3c18..34516f82 100644
> > --- a/common/xfs
> > +++ b/common/xfs
> > @@ -884,3 +884,26 @@ _xfs_mount_agcount()
> >  {
> >  	$XFS_INFO_PROG "$1" | grep agcount= | sed -e 's/^.*agcount=\([0-9]*\),.*$/\1/g'
> >  }
> > +
> > +# wipe the superblock of each XFS AGs
> > +try_wipe_scratch_xfs()
> > +{
> > +	local tmp=`mktemp -u`
> > +
> > +	_scratch_mkfs_xfs -N 2>/dev/null | perl -ne '
> > +		if (/^meta-data=.*\s+agcount=(\d+), agsize=(\d+) blks/) {
> > +			print STDOUT "agcount=$1\nagsize=$2\n";
> > +		}
> > +		if (/^data\s+=\s+bsize=(\d+)\s/) {
> > +			print STDOUT "dbsize=$1\n";
> > +		}' > $tmp.mkfs
> > +
> > +	. $tmp.mkfs
> > +	if [ -n "$agcount" -a -n "$agsize" -a -n "$dbsize" ];then
> > +		for ((i = 0; i < agcount; i++)); do
> > +			$XFS_IO_PROG -c "pwrite $((i * dbsize * agsize)) $dbsize" \
> > +				$SCRATCH_DEV >/dev/null;
> > +		done
> > +	fi
> > +	rm -f $tmp.mkfs
> > +}
> >
>
>
on 2019/09/20 10:48, Darrick J. Wong wrote:
> On Fri, Sep 20, 2019 at 09:52:11AM +0800, Yang Xu wrote:
>>
>>
>> on 2019/09/19 23:00, Zorro Lang wrote:
>>> xfs/030 always fails after d0e484ac699f ("check: wipe scratch devices
>>> between tests") get merged.
>>>
>>> Due to xfs/030 does a sized(100m) mkfs. Before we merge above commit,
>>> mkfs.xfs detects an old primary superblock, it will write zeroes to
>>> all superblocks before formatting the new filesystem. But this won't
>>> be done if we wipe the first superblock(by merging above commit).
>>>
>>> That means if we make a (smaller) sized xfs after wipefs, those *old*
>>> superblocks which created by last time mkfs.xfs will be left on disk.
>>> Then when we do xfs_repair, if xfs_repair can't find the first SB, it
>>> will go to find those *old* SB at first. When it finds them,
>>> everyting goes wrong.
>>>
>>> So I try to get XFS AG geometry(by default) and then try to erase all
>>> superblocks. Thanks Darrick J. Wong helped to analyze this issue.
>> Feel free to add Reported-by.
>>>
>>> Signed-off-by: Zorro Lang <zlang@redhat.com>
>>> ---
>>>  common/rc  |  4 ++++
>>>  common/xfs | 23 +++++++++++++++++++++++
>>>  2 files changed, 27 insertions(+)
>>>
>>> diff --git a/common/rc b/common/rc
>>> index 66c7fd4d..fe13f659 100644
>>> --- a/common/rc
>>> +++ b/common/rc
>>> @@ -4048,6 +4048,10 @@ _try_wipe_scratch_devs()
>>>  	for dev in $SCRATCH_DEV_POOL $SCRATCH_DEV $SCRATCH_LOGDEV $SCRATCH_RTDEV; do
>>>  		test -b $dev && $WIPEFS_PROG -a $dev
>>>  	done
>>> +
>>> +	if [ "$FSTYP" = "xfs" ];then
>>> +		try_wipe_scratch_xfs
>> I think we should add a simple comment for why we add it.
>>
>> ps:_scratch_mkfs_xfs also can make case pass. We can use it and add comment.
>> the try_wipe_scratch_xfs method and the _scratch_mkfs_xfs method are all
>> acceptable for me.
>
> Yes, I suppose formatting and then wiping per below would also achieve
> our means, but it would come at the extra cost of zeroing the log. I'm
> not too eager to increase xfstest runtime even more.
>

I see. Thanks.

> Hmmm, I wonder if xfs_db could just grow a 'wipe all superblocks'
> command....

Good idea.

>
> --D
>
>>> +	fi
>>>  }
>>>  # Only run this on xfs if xfs_scrub is available and has the unicode checker
>>> diff --git a/common/xfs b/common/xfs
>>> index 1bce3c18..34516f82 100644
>>> --- a/common/xfs
>>> +++ b/common/xfs
>>> @@ -884,3 +884,26 @@ _xfs_mount_agcount()
>>>  {
>>>  	$XFS_INFO_PROG "$1" | grep agcount= | sed -e 's/^.*agcount=\([0-9]*\),.*$/\1/g'
>>>  }
>>> +
>>> +# wipe the superblock of each XFS AGs
>>> +try_wipe_scratch_xfs()
>>> +{
>>> +	local tmp=`mktemp -u`
>>> +
>>> +	_scratch_mkfs_xfs -N 2>/dev/null | perl -ne '
>>> +		if (/^meta-data=.*\s+agcount=(\d+), agsize=(\d+) blks/) {
>>> +			print STDOUT "agcount=$1\nagsize=$2\n";
>>> +		}
>>> +		if (/^data\s+=\s+bsize=(\d+)\s/) {
>>> +			print STDOUT "dbsize=$1\n";
>>> +		}' > $tmp.mkfs
>>> +
>>> +	. $tmp.mkfs
>>> +	if [ -n "$agcount" -a -n "$agsize" -a -n "$dbsize" ];then
>>> +		for ((i = 0; i < agcount; i++)); do
>>> +			$XFS_IO_PROG -c "pwrite $((i * dbsize * agsize)) $dbsize" \
>>> +				$SCRATCH_DEV >/dev/null;
>>> +		done
>>> +	fi
>>> +	rm -f $tmp.mkfs
>>> +}
>>>
>>
>>
>
>
On Fri, Sep 20, 2019 at 12:31:39PM +0800, Zorro Lang wrote:
> On Thu, Sep 19, 2019 at 07:48:36PM -0700, Darrick J. Wong wrote:
> > On Fri, Sep 20, 2019 at 09:52:11AM +0800, Yang Xu wrote:
> > >
> > >
> > > on 2019/09/19 23:00, Zorro Lang wrote:
> > > > xfs/030 always fails after d0e484ac699f ("check: wipe scratch devices
> > > > between tests") get merged.
> > > >
> > > > Due to xfs/030 does a sized(100m) mkfs. Before we merge above commit,
> > > > mkfs.xfs detects an old primary superblock, it will write zeroes to
> > > > all superblocks before formatting the new filesystem. But this won't
> > > > be done if we wipe the first superblock(by merging above commit).
> > > >
> > > > That means if we make a (smaller) sized xfs after wipefs, those *old*
> > > > superblocks which created by last time mkfs.xfs will be left on disk.
> > > > Then when we do xfs_repair, if xfs_repair can't find the first SB, it
> > > > will go to find those *old* SB at first. When it finds them,
> > > > everyting goes wrong.
> > > >
> > > > So I try to get XFS AG geometry(by default) and then try to erase all
> > > > superblocks. Thanks Darrick J. Wong helped to analyze this issue.
> > > Feel free to add Reported-by.
> > > >
> > > > Signed-off-by: Zorro Lang <zlang@redhat.com>
> > > > ---
> > > >  common/rc  |  4 ++++
> > > >  common/xfs | 23 +++++++++++++++++++++++
> > > >  2 files changed, 27 insertions(+)
> > > >
> > > > diff --git a/common/rc b/common/rc
> > > > index 66c7fd4d..fe13f659 100644
> > > > --- a/common/rc
> > > > +++ b/common/rc
> > > > @@ -4048,6 +4048,10 @@ _try_wipe_scratch_devs()
> > > >  	for dev in $SCRATCH_DEV_POOL $SCRATCH_DEV $SCRATCH_LOGDEV $SCRATCH_RTDEV; do
> > > >  		test -b $dev && $WIPEFS_PROG -a $dev
> > > >  	done
> > > > +
> > > > +	if [ "$FSTYP" = "xfs" ];then
> > > > +		try_wipe_scratch_xfs
> > > I think we should add a simple comment for why we add it.
> > >
> > > ps:_scratch_mkfs_xfs also can make case pass. We can use it and add comment.
> > > the try_wipe_scratch_xfs method and the _scratch_mkfs_xfs method are all
> > > acceptable for me.
> >
> > Yes, I suppose formatting and then wiping per below would also achieve
> > our means, but it would come at the extra cost of zeroing the log. I'm
> > not too eager to increase xfstest runtime even more.
> >
> > Hmmm, I wonder if xfs_db could just grow a 'wipe all superblocks'
> > command....
>
> Haha, I was thinking about that too, and I tried this:
> --
> agc=`_scratch_xfs_get_sb_field agcount`
> wipe_xfs_cmd="$XFS_DB_PROG -x"
> for ((i=0; i<agc; i++)); do
> 	wipe_xfs_cmd="$wipe_xfs_cmd -c \"sb $i\" -c \"write -c magicnum 0x00000000\""
> done
> wipe_xfs_cmd="$wipe_xfs_cmd $SCRATCH_DEV"
> eval $wipe_xfs_cmd
> --
>
> The only one problem about this, I think it's the max length of bash command:)

Yeah... I mean, the downside of all this is that a filesystem could have
thousands of AGs, though I don't imagine there are many people who set
up a 1PB array just to run xfstests ;)

--D

> Thanks,
> Zorro
>
> >
> > --D
> >
> > > > +	fi
> > > >  }
> > > >  # Only run this on xfs if xfs_scrub is available and has the unicode checker
> > > > diff --git a/common/xfs b/common/xfs
> > > > index 1bce3c18..34516f82 100644
> > > > --- a/common/xfs
> > > > +++ b/common/xfs
> > > > @@ -884,3 +884,26 @@ _xfs_mount_agcount()
> > > >  {
> > > >  	$XFS_INFO_PROG "$1" | grep agcount= | sed -e 's/^.*agcount=\([0-9]*\),.*$/\1/g'
> > > >  }
> > > > +
> > > > +# wipe the superblock of each XFS AGs
> > > > +try_wipe_scratch_xfs()
> > > > +{
> > > > +	local tmp=`mktemp -u`
> > > > +
> > > > +	_scratch_mkfs_xfs -N 2>/dev/null | perl -ne '
> > > > +		if (/^meta-data=.*\s+agcount=(\d+), agsize=(\d+) blks/) {
> > > > +			print STDOUT "agcount=$1\nagsize=$2\n";
> > > > +		}
> > > > +		if (/^data\s+=\s+bsize=(\d+)\s/) {
> > > > +			print STDOUT "dbsize=$1\n";
> > > > +		}' > $tmp.mkfs
> > > > +
> > > > +	. $tmp.mkfs
> > > > +	if [ -n "$agcount" -a -n "$agsize" -a -n "$dbsize" ];then
> > > > +		for ((i = 0; i < agcount; i++)); do
> > > > +			$XFS_IO_PROG -c "pwrite $((i * dbsize * agsize)) $dbsize" \
> > > > +				$SCRATCH_DEV >/dev/null;
> > > > +		done
> > > > +	fi
> > > > +	rm -f $tmp.mkfs
> > > > +}
> > > >
> > >
> > >
On Thu, Sep 19, 2019 at 07:48:36PM -0700, Darrick J. Wong wrote:
> On Fri, Sep 20, 2019 at 09:52:11AM +0800, Yang Xu wrote:
> >
> >
> > on 2019/09/19 23:00, Zorro Lang wrote:
> > > xfs/030 always fails after d0e484ac699f ("check: wipe scratch devices
> > > between tests") get merged.
> > >
> > > Due to xfs/030 does a sized(100m) mkfs. Before we merge above commit,
> > > mkfs.xfs detects an old primary superblock, it will write zeroes to
> > > all superblocks before formatting the new filesystem. But this won't
> > > be done if we wipe the first superblock(by merging above commit).
> > >
> > > That means if we make a (smaller) sized xfs after wipefs, those *old*
> > > superblocks which created by last time mkfs.xfs will be left on disk.
> > > Then when we do xfs_repair, if xfs_repair can't find the first SB, it
> > > will go to find those *old* SB at first. When it finds them,
> > > everyting goes wrong.
> > >
> > > So I try to get XFS AG geometry(by default) and then try to erase all
> > > superblocks. Thanks Darrick J. Wong helped to analyze this issue.
> > Feel free to add Reported-by.
> > >
> > > Signed-off-by: Zorro Lang <zlang@redhat.com>
> > > ---
> > >  common/rc  |  4 ++++
> > >  common/xfs | 23 +++++++++++++++++++++++
> > >  2 files changed, 27 insertions(+)
> > >
> > > diff --git a/common/rc b/common/rc
> > > index 66c7fd4d..fe13f659 100644
> > > --- a/common/rc
> > > +++ b/common/rc
> > > @@ -4048,6 +4048,10 @@ _try_wipe_scratch_devs()
> > >  	for dev in $SCRATCH_DEV_POOL $SCRATCH_DEV $SCRATCH_LOGDEV $SCRATCH_RTDEV; do
> > >  		test -b $dev && $WIPEFS_PROG -a $dev
> > >  	done
> > > +
> > > +	if [ "$FSTYP" = "xfs" ];then
> > > +		try_wipe_scratch_xfs
> > I think we should add a simple comment for why we add it.
> >
> > ps:_scratch_mkfs_xfs also can make case pass. We can use it and add comment.
> > the try_wipe_scratch_xfs method and the _scratch_mkfs_xfs method are all
> > acceptable for me.
>
> Yes, I suppose formatting and then wiping per below would also achieve
> our means, but it would come at the extra cost of zeroing the log. I'm
> not too eager to increase xfstest runtime even more.
>
> Hmmm, I wonder if xfs_db could just grow a 'wipe all superblocks'
> command....

Haha, I was thinking about that too, and I tried this:
--
agc=`_scratch_xfs_get_sb_field agcount`
wipe_xfs_cmd="$XFS_DB_PROG -x"
for ((i=0; i<agc; i++)); do
	wipe_xfs_cmd="$wipe_xfs_cmd -c \"sb $i\" -c \"write -c magicnum 0x00000000\""
done
wipe_xfs_cmd="$wipe_xfs_cmd $SCRATCH_DEV"
eval $wipe_xfs_cmd
--

The only problem with this, I think, is the maximum length of a bash
command :)

Thanks,
Zorro

>
> --D
>
> > > +	fi
> > >  }
> > >  # Only run this on xfs if xfs_scrub is available and has the unicode checker
> > > diff --git a/common/xfs b/common/xfs
> > > index 1bce3c18..34516f82 100644
> > > --- a/common/xfs
> > > +++ b/common/xfs
> > > @@ -884,3 +884,26 @@ _xfs_mount_agcount()
> > >  {
> > >  	$XFS_INFO_PROG "$1" | grep agcount= | sed -e 's/^.*agcount=\([0-9]*\),.*$/\1/g'
> > >  }
> > > +
> > > +# wipe the superblock of each XFS AGs
> > > +try_wipe_scratch_xfs()
> > > +{
> > > +	local tmp=`mktemp -u`
> > > +
> > > +	_scratch_mkfs_xfs -N 2>/dev/null | perl -ne '
> > > +		if (/^meta-data=.*\s+agcount=(\d+), agsize=(\d+) blks/) {
> > > +			print STDOUT "agcount=$1\nagsize=$2\n";
> > > +		}
> > > +		if (/^data\s+=\s+bsize=(\d+)\s/) {
> > > +			print STDOUT "dbsize=$1\n";
> > > +		}' > $tmp.mkfs
> > > +
> > > +	. $tmp.mkfs
> > > +	if [ -n "$agcount" -a -n "$agsize" -a -n "$dbsize" ];then
> > > +		for ((i = 0; i < agcount; i++)); do
> > > +			$XFS_IO_PROG -c "pwrite $((i * dbsize * agsize)) $dbsize" \
> > > +				$SCRATCH_DEV >/dev/null;
> > > +		done
> > > +	fi
> > > +	rm -f $tmp.mkfs
> > > +}
> > >
> >
> >
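One way around the command-length concern is to generate the xfs_db commands as a stream instead of building one long argument list. The sketch below only produces the command text. `gen_wipe_script` is a hypothetical name, the `write` line is copied from the snippet in this mail, and feeding the result to xfs_db relies on the assumption that xfs_db reads commands from standard input when no `-c` option is given.

```shell
#!/bin/bash
# Emit one "sb N" / "write" pair per allocation group, one command per line.
# gen_wipe_script is a hypothetical helper name.
gen_wipe_script()
{
	local agcount="$1" i

	for ((i = 0; i < agcount; i++)); do
		printf 'sb %d\nwrite -c magicnum 0x00000000\n' "$i"
	done
}

# Show the commands for a three-AG filesystem.
gen_wipe_script 3

# Intended use (assumption: xfs_db accepts commands on stdin):
#   agc=`_scratch_xfs_get_sb_field agcount`
#   gen_wipe_script "$agc" | $XFS_DB_PROG -x $SCRATCH_DEV
```

Because the commands travel through a pipe, their total size is not bounded by the argument-length limit, however many AGs the filesystem has.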
diff --git a/common/rc b/common/rc
index 66c7fd4d..fe13f659 100644
--- a/common/rc
+++ b/common/rc
@@ -4048,6 +4048,10 @@ _try_wipe_scratch_devs()
 	for dev in $SCRATCH_DEV_POOL $SCRATCH_DEV $SCRATCH_LOGDEV $SCRATCH_RTDEV; do
 		test -b $dev && $WIPEFS_PROG -a $dev
 	done
+
+	if [ "$FSTYP" = "xfs" ];then
+		try_wipe_scratch_xfs
+	fi
 }
 
 # Only run this on xfs if xfs_scrub is available and has the unicode checker
diff --git a/common/xfs b/common/xfs
index 1bce3c18..34516f82 100644
--- a/common/xfs
+++ b/common/xfs
@@ -884,3 +884,26 @@ _xfs_mount_agcount()
 {
 	$XFS_INFO_PROG "$1" | grep agcount= | sed -e 's/^.*agcount=\([0-9]*\),.*$/\1/g'
 }
+
+# wipe the superblock of each XFS AGs
+try_wipe_scratch_xfs()
+{
+	local tmp=`mktemp -u`
+
+	_scratch_mkfs_xfs -N 2>/dev/null | perl -ne '
+		if (/^meta-data=.*\s+agcount=(\d+), agsize=(\d+) blks/) {
+			print STDOUT "agcount=$1\nagsize=$2\n";
+		}
+		if (/^data\s+=\s+bsize=(\d+)\s/) {
+			print STDOUT "dbsize=$1\n";
+		}' > $tmp.mkfs
+
+	. $tmp.mkfs
+	if [ -n "$agcount" -a -n "$agsize" -a -n "$dbsize" ];then
+		for ((i = 0; i < agcount; i++)); do
+			$XFS_IO_PROG -c "pwrite $((i * dbsize * agsize)) $dbsize" \
+				$SCRATCH_DEV >/dev/null;
+		done
+	fi
+	rm -f $tmp.mkfs
+}
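The wipe loop in this patch overwrites the first block of every allocation group. To try the same offset arithmetic without a scratch device, the sketch below runs it against a small image file. It swaps `xfs_io pwrite` for `dd` writing zeroes, and the name `wipe_ag_superblocks` and the tiny geometry (4 AGs of 2 blocks) are made up for the demonstration.

```shell
#!/bin/bash
# Overwrite the first block of each AG in an image file with zeroes.
# wipe_ag_superblocks IMG AGCOUNT AGSIZE_IN_BLOCKS BLOCKSIZE (hypothetical)
wipe_ag_superblocks()
{
	local img="$1" agcount="$2" agsize="$3" dbsize="$4" i

	for ((i = 0; i < agcount; i++)); do
		# With bs=$dbsize, seeking to block i * agsize lands on byte
		# offset i * agsize * dbsize, the start of AG number i.
		dd if=/dev/zero of="$img" bs="$dbsize" seek=$((i * agsize)) \
			count=1 conv=notrunc 2>/dev/null || return 1
	done
}

# Build a 32 KiB image filled with 'x' so the wiped blocks stand out.
img=$(mktemp)
head -c 32768 /dev/zero | tr '\0' 'x' > "$img"

wipe_ag_superblocks "$img" 4 2 4096

size=$(( $(wc -c < "$img") ))
remaining=$(( $(tr -d '\0' < "$img" | wc -c) ))
echo "image size: $size bytes, bytes left untouched: $remaining"
rm -f "$img"
```

Half of the image, the second block of each AG, should still hold the fill pattern afterwards, and `conv=notrunc` keeps the image at its original size.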
xfs/030 always fails after d0e484ac699f ("check: wipe scratch devices
between tests") was merged.

xfs/030 does a sized (100m) mkfs. Before that commit was merged,
mkfs.xfs would detect the old primary superblock and write zeroes to
all of the old superblocks before formatting the new filesystem. That
no longer happens once the first superblock has been wiped, which is
what the commit does.

That means if we make a smaller, sized XFS after wipefs, the *old*
superblocks created by the previous mkfs.xfs are left on disk. Then,
when xfs_repair can't find the first SB, it goes looking for other
superblocks and finds those *old* ones first. When it does, everything
goes wrong.

So get the default XFS AG geometry and try to erase all superblocks.
Thanks to Darrick J. Wong for helping to analyze this issue.

Signed-off-by: Zorro Lang <zlang@redhat.com>
---
 common/rc  |  4 ++++
 common/xfs | 23 +++++++++++++++++++++++
 2 files changed, 27 insertions(+)
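The stale-superblock problem described in this commit message comes down to offset arithmetic: superblock N sits at byte offset N * agsize * blocksize. The sketch below lists the old superblock offsets that a new, smaller filesystem does not overwrite. The geometries are illustrative assumptions (an old layout of four 1 GiB AGs, a new one of six 16 MiB AGs, 4096-byte blocks), and the function names are hypothetical.

```shell
#!/bin/bash
# Print the byte offset of every AG's superblock for a given geometry.
sb_offsets()
{
	local agcount="$1" agsize="$2" dbsize="$3" i

	for ((i = 0; i < agcount; i++)); do
		echo $((i * agsize * dbsize))
	done
}

# Print the old superblock offsets that the new geometry does not cover.
# stale_sb_offsets OLD_AGCOUNT OLD_AGSIZE NEW_AGCOUNT NEW_AGSIZE BLOCKSIZE
stale_sb_offsets()
{
	sb_offsets "$1" "$2" "$5" | grep -vxF -f <(sb_offsets "$3" "$4" "$5")
}

# Old: 4 AGs of 262144 blocks (1 GiB each). New: 6 AGs of 4096 blocks
# (16 MiB each). Block size 4096 bytes in both cases.
stale_sb_offsets 4 262144 6 4096 4096
```

With these numbers the old superblocks at 1 GiB, 2 GiB and 3 GiB survive the new mkfs, which matches the situation the commit message describes: only offset 0 is shared, and that is the one wipefs already cleared.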