Message ID | 162561728893.543423.5093723938379703860.stgit@locust (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | fstests: random fixes | expand |
On 7/6/21 5:21 PM, Darrick J. Wong wrote: > From: Darrick J. Wong <djwong@kernel.org> > > Every now and then I see the following failure when running generic/347: > > --- generic/347.out > +++ generic/347.out.bad > @@ -1,2 +1,2 @@ > QA output created by 347 > -=== completed > +failed to create dm thin pool device > > Accompanied by the following dmesg spew: > > device-mapper: thin metadata: sb_check failed: blocknr 7016996765293437281: wanted 0 > device-mapper: block manager: superblock validator check failed for block 0 > device-mapper: thin metadata: couldn't read superblock > device-mapper: table: 253:2: thin-pool: Error creating metadata object > device-mapper: ioctl: error adding target to table > > 7016996765293437281 is of course the magic number 0x6161616161616161, > which are stale ondisk contents left behind by previous tests that wrote > known tests patterns to files on the scratch device. This is a bit > surprising, since _dmthin_init supposedly zeroes the first 4k of the > thin pool metadata device before initializing the pool. Or does it? > > dd if=/dev/zero of=$DMTHIN_META_DEV bs=4096 count=1 &>/dev/null > > Herein lies the problem: the dd process writes zeroes into the page > cache and exits. Normally the block layer will flush the page cache > after the last file descriptor is closed, but once in a while the > terminating dd process won't be the only process in the system with an > open file descriptor! > > That process is of course udev. The write() call from dd triggers a > kernel uevent, which starts udev. If udev is running particularly > slowly, it'll still be running an instant later when dd terminates, > thereby preventing the page cache flush. If udev is still running a > moment later when we call dmsetup to set up the thin pool, the pool > creation will issue a bio to read the ondisk superblock. This read > isn't coherent with the page cache, so it sees old disk contents and the > test fails even though we supposedly formatted the metadata device. > > Fix this by explicitly flushing the page cache after writing the zeroes. > > Fixes: 4b52fffb ("dm-thinp helpers in common/dmthin") > Signed-off-by: Darrick J. Wong <djwong@kernel.org> Ok, makes sense. Good catch! Reviewed-by: Allison Henderson <allison.henderson@oracle.com> > --- > common/dmthin | 8 ++++++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > > diff --git a/common/dmthin b/common/dmthin > index 3b1c7d45..91147e47 100644 > --- a/common/dmthin > +++ b/common/dmthin > @@ -113,8 +113,12 @@ _dmthin_init() > _dmsetup_create $DMTHIN_DATA_NAME --table "$DMTHIN_DATA_TABLE" || \ > _fatal "failed to create dm thin data device" > > - # Zap the pool metadata dev > - dd if=/dev/zero of=$DMTHIN_META_DEV bs=4096 count=1 &>/dev/null > + # Zap the pool metadata dev. Explicitly fsync the zeroes to disk > + # because a slow-running udev running concurrently with dd can maintain > + # an open file descriptor. The block layer only flushes the page cache > + # on last close, which means that the thin pool creation below will > + # see the (stale) ondisk contents and fail. > + dd if=/dev/zero of=$DMTHIN_META_DEV bs=4096 count=1 conv=fsync &>/dev/null > > # Thin pool > # "start length thin-pool metadata_dev data_dev data_block_size low_water_mark" >
On Tue, Jul 06, 2021 at 05:21:28PM -0700, Darrick J. Wong wrote: > From: Darrick J. Wong <djwong@kernel.org> > > Every now and then I see the following failure when running generic/347: > > --- generic/347.out > +++ generic/347.out.bad > @@ -1,2 +1,2 @@ > QA output created by 347 > -=== completed > +failed to create dm thin pool device > > Accompanied by the following dmesg spew: > > device-mapper: thin metadata: sb_check failed: blocknr 7016996765293437281: wanted 0 > device-mapper: block manager: superblock validator check failed for block 0 > device-mapper: thin metadata: couldn't read superblock > device-mapper: table: 253:2: thin-pool: Error creating metadata object > device-mapper: ioctl: error adding target to table > > 7016996765293437281 is of course the magic number 0x6161616161616161, > which are stale ondisk contents left behind by previous tests that wrote > known tests patterns to files on the scratch device. This is a bit > surprising, since _dmthin_init supposedly zeroes the first 4k of the > thin pool metadata device before initializing the pool. Or does it? > > dd if=/dev/zero of=$DMTHIN_META_DEV bs=4096 count=1 &>/dev/null > > Herein lies the problem: the dd process writes zeroes into the page > cache and exits. Normally the block layer will flush the page cache > after the last file descriptor is closed, but once in a while the > terminating dd process won't be the only process in the system with an > open file descriptor! > > That process is of course udev. The write() call from dd triggers a > kernel uevent, which starts udev. If udev is running particularly > slowly, it'll still be running an instant later when dd terminates, > thereby preventing the page cache flush. If udev is still running a > moment later when we call dmsetup to set up the thin pool, the pool > creation will issue a bio to read the ondisk superblock. This read > isn't coherent with the page cache, so it sees old disk contents and the > test fails even though we supposedly formatted the metadata device. > > Fix this by explicitly flushing the page cache after writing the zeroes. > > Fixes: 4b52fffb ("dm-thinp helpers in common/dmthin") > Signed-off-by: Darrick J. Wong <djwong@kernel.org> This is a hard one to catch, thanks for the patch! :) Eryu > --- > common/dmthin | 8 ++++++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > > diff --git a/common/dmthin b/common/dmthin > index 3b1c7d45..91147e47 100644 > --- a/common/dmthin > +++ b/common/dmthin > @@ -113,8 +113,12 @@ _dmthin_init() > _dmsetup_create $DMTHIN_DATA_NAME --table "$DMTHIN_DATA_TABLE" || \ > _fatal "failed to create dm thin data device" > > - # Zap the pool metadata dev > - dd if=/dev/zero of=$DMTHIN_META_DEV bs=4096 count=1 &>/dev/null > + # Zap the pool metadata dev. Explicitly fsync the zeroes to disk > + # because a slow-running udev running concurrently with dd can maintain > + # an open file descriptor. The block layer only flushes the page cache > + # on last close, which means that the thin pool creation below will > + # see the (stale) ondisk contents and fail. > + dd if=/dev/zero of=$DMTHIN_META_DEV bs=4096 count=1 conv=fsync &>/dev/null > > # Thin pool > # "start length thin-pool metadata_dev data_dev data_block_size low_water_mark"
On Tue, Jul 06, 2021 at 05:21:28PM -0700, Darrick J. Wong wrote: > From: Darrick J. Wong <djwong@kernel.org> > > Every now and then I see the following failure when running generic/347: > > --- generic/347.out > +++ generic/347.out.bad > @@ -1,2 +1,2 @@ > QA output created by 347 > -=== completed > +failed to create dm thin pool device Note that I have to add space indention to above diff lines, otherwise it seems git am will treat lines after "---" as the beginning of patch, and failed to apply. Thanks, Eryu > > Accompanied by the following dmesg spew: > > device-mapper: thin metadata: sb_check failed: blocknr 7016996765293437281: wanted 0 > device-mapper: block manager: superblock validator check failed for block 0 > device-mapper: thin metadata: couldn't read superblock > device-mapper: table: 253:2: thin-pool: Error creating metadata object > device-mapper: ioctl: error adding target to table > > 7016996765293437281 is of course the magic number 0x6161616161616161, > which are stale ondisk contents left behind by previous tests that wrote > known tests patterns to files on the scratch device. This is a bit > surprising, since _dmthin_init supposedly zeroes the first 4k of the > thin pool metadata device before initializing the pool. Or does it? > > dd if=/dev/zero of=$DMTHIN_META_DEV bs=4096 count=1 &>/dev/null > > Herein lies the problem: the dd process writes zeroes into the page > cache and exits. Normally the block layer will flush the page cache > after the last file descriptor is closed, but once in a while the > terminating dd process won't be the only process in the system with an > open file descriptor! > > That process is of course udev. The write() call from dd triggers a > kernel uevent, which starts udev. If udev is running particularly > slowly, it'll still be running an instant later when dd terminates, > thereby preventing the page cache flush. If udev is still running a > moment later when we call dmsetup to set up the thin pool, the pool > creation will issue a bio to read the ondisk superblock. This read > isn't coherent with the page cache, so it sees old disk contents and the > test fails even though we supposedly formatted the metadata device. > > Fix this by explicitly flushing the page cache after writing the zeroes. > > Fixes: 4b52fffb ("dm-thinp helpers in common/dmthin") > Signed-off-by: Darrick J. Wong <djwong@kernel.org> > --- > common/dmthin | 8 ++++++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > > diff --git a/common/dmthin b/common/dmthin > index 3b1c7d45..91147e47 100644 > --- a/common/dmthin > +++ b/common/dmthin > @@ -113,8 +113,12 @@ _dmthin_init() > _dmsetup_create $DMTHIN_DATA_NAME --table "$DMTHIN_DATA_TABLE" || \ > _fatal "failed to create dm thin data device" > > - # Zap the pool metadata dev > - dd if=/dev/zero of=$DMTHIN_META_DEV bs=4096 count=1 &>/dev/null > + # Zap the pool metadata dev. Explicitly fsync the zeroes to disk > + # because a slow-running udev running concurrently with dd can maintain > + # an open file descriptor. The block layer only flushes the page cache > + # on last close, which means that the thin pool creation below will > + # see the (stale) ondisk contents and fail. > + dd if=/dev/zero of=$DMTHIN_META_DEV bs=4096 count=1 conv=fsync &>/dev/null > > # Thin pool > # "start length thin-pool metadata_dev data_dev data_block_size low_water_mark"
--- generic/347.out +++ generic/347.out.bad @@ -1,2 +1,2 @@ QA output created by 347 -=== completed +failed to create dm thin pool device Accompanied by the following dmesg spew: device-mapper: thin metadata: sb_check failed: blocknr 7016996765293437281: wanted 0 device-mapper: block manager: superblock validator check failed for block 0 device-mapper: thin metadata: couldn't read superblock device-mapper: table: 253:2: thin-pool: Error creating metadata object device-mapper: ioctl: error adding target to table 7016996765293437281 is of course the magic number 0x6161616161616161, which are stale ondisk contents left behind by previous tests that wrote known tests patterns to files on the scratch device. This is a bit surprising, since _dmthin_init supposedly zeroes the first 4k of the thin pool metadata device before initializing the pool. Or does it? dd if=/dev/zero of=$DMTHIN_META_DEV bs=4096 count=1 &>/dev/null Herein lies the problem: the dd process writes zeroes into the page cache and exits. Normally the block layer will flush the page cache after the last file descriptor is closed, but once in a while the terminating dd process won't be the only process in the system with an open file descriptor! That process is of course udev. The write() call from dd triggers a kernel uevent, which starts udev. If udev is running particularly slowly, it'll still be running an instant later when dd terminates, thereby preventing the page cache flush. If udev is still running a moment later when we call dmsetup to set up the thin pool, the pool creation will issue a bio to read the ondisk superblock. This read isn't coherent with the page cache, so it sees old disk contents and the test fails even though we supposedly formatted the metadata device. Fix this by explicitly flushing the page cache after writing the zeroes. Fixes: 4b52fffb ("dm-thinp helpers in common/dmthin") Signed-off-by: Darrick J. Wong <djwong@kernel.org> --- common/dmthin | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/common/dmthin b/common/dmthin index 3b1c7d45..91147e47 100644 --- a/common/dmthin +++ b/common/dmthin @@ -113,8 +113,12 @@ _dmthin_init() _dmsetup_create $DMTHIN_DATA_NAME --table "$DMTHIN_DATA_TABLE" || \ _fatal "failed to create dm thin data device" - # Zap the pool metadata dev - dd if=/dev/zero of=$DMTHIN_META_DEV bs=4096 count=1 &>/dev/null + # Zap the pool metadata dev. Explicitly fsync the zeroes to disk + # because a slow-running udev running concurrently with dd can maintain + # an open file descriptor. The block layer only flushes the page cache + # on last close, which means that the thin pool creation below will + # see the (stale) ondisk contents and fail. + dd if=/dev/zero of=$DMTHIN_META_DEV bs=4096 count=1 conv=fsync &>/dev/null # Thin pool # "start length thin-pool metadata_dev data_dev data_block_size low_water_mark"
From: Darrick J. Wong <djwong@kernel.org> Every now and then I see the following failure when running generic/347: