Message ID | 20171113175724.7302-1-bo.li.liu@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Nov 13, 2017 at 10:57:24AM -0700, Liu Bo wrote: > Xfstests btrfs/146 revealed this corruption, > > [ 58.138831] Buffer I/O error on dev dm-0, logical block 2621424, async page read > [ 58.151233] BTRFS error (device sdf): bdev /dev/mapper/error-test errs: wr 1, rd 0, flush 0, corrupt 0, gen 0 > [ 58.152403] list_add corruption. prev->next should be next (ffff88005e6775d8), but was ffffc9000189be88. (prev=ffffc9000189be88). > [ 58.153518] ------------[ cut here ]------------ > [ 58.153892] WARNING: CPU: 1 PID: 1287 at lib/list_debug.c:31 __list_add_valid+0x169/0x1f0 > ... > [ 58.157379] RIP: 0010:__list_add_valid+0x169/0x1f0 > ... > [ 58.161956] Call Trace: > [ 58.162264] btrfs_log_inode_parent+0x5bd/0xfb0 [btrfs] > [ 58.163583] btrfs_log_dentry_safe+0x60/0x80 [btrfs] > [ 58.164003] btrfs_sync_file+0x4c2/0x6f0 [btrfs] > [ 58.164393] vfs_fsync_range+0x5f/0xd0 > [ 58.164898] do_fsync+0x5a/0x90 > [ 58.165170] SyS_fsync+0x10/0x20 > [ 58.165395] entry_SYSCALL_64_fastpath+0x1f/0xbe > ... > > It turns out that we could record btrfs_log_ctx:io_err in > log_one_extents when IO fails, but make log_one_extents() return '0' > instead of -EIO, so the IO error is not acknowledged by the callers, > i.e. btrfs_log_inode_parent(), which would remove btrfs_log_ctx:list > from list head 'root->log_ctxs'. Since btrfs_log_ctx is allocated > from stack memory, it'd get freed with a object alive on the > list. then a future list_add will throw the above warning. > > This returns the correct error in the above case. > > Jeff also reported this while testing against his fsync error > patch set[1]. > > [1]: https://www.spinics.net/lists/linux-btrfs/msg65308.html > "btrfs list corruption and soft lockups while testing writeback error handling" > Fixes: 8407f553268a4611f254 ("Btrfs: fix data corruption after fast fsync and writeback error") you can also add the commit author to CC. > Signed-off-by: Liu Bo <bo.li.liu@oracle.com> > --- > fs/btrfs/file.c | 1 + > fs/btrfs/tree-log.c | 2 +- > 2 files changed, 2 insertions(+), 1 deletion(-) > > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c > index 063180b..db70eaa 100644 > --- a/fs/btrfs/file.c > +++ b/fs/btrfs/file.c > @@ -2241,6 +2241,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync) > if (ctx.io_err) { > btrfs_end_transaction(trans); > ret = ctx.io_err; > + ASSERT(list_empty(&ctx.list)); Please move that to the label 'out', so all exit paths can catch the problem. > goto out; > } > > diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c > index c800d06..d300284 100644 > --- a/fs/btrfs/tree-log.c > +++ b/fs/btrfs/tree-log.c > @@ -4100,7 +4100,7 @@ static int log_one_extent(struct btrfs_trans_handle *trans, > > if (ordered_io_err) { > ctx->io_err = -EIO; > - return 0; > + return ctx->io_err; > } > > btrfs_init_map_token(&token); > -- > 2.9.4 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Nov 20, 2017 at 06:34:34PM +0100, David Sterba wrote: > On Mon, Nov 13, 2017 at 10:57:24AM -0700, Liu Bo wrote: > > Xfstests btrfs/146 revealed this corruption, > > > > [ 58.138831] Buffer I/O error on dev dm-0, logical block 2621424, async page read > > [ 58.151233] BTRFS error (device sdf): bdev /dev/mapper/error-test errs: wr 1, rd 0, flush 0, corrupt 0, gen 0 > > [ 58.152403] list_add corruption. prev->next should be next (ffff88005e6775d8), but was ffffc9000189be88. (prev=ffffc9000189be88). > > [ 58.153518] ------------[ cut here ]------------ > > [ 58.153892] WARNING: CPU: 1 PID: 1287 at lib/list_debug.c:31 __list_add_valid+0x169/0x1f0 > > ... > > [ 58.157379] RIP: 0010:__list_add_valid+0x169/0x1f0 > > ... > > [ 58.161956] Call Trace: > > [ 58.162264] btrfs_log_inode_parent+0x5bd/0xfb0 [btrfs] > > [ 58.163583] btrfs_log_dentry_safe+0x60/0x80 [btrfs] > > [ 58.164003] btrfs_sync_file+0x4c2/0x6f0 [btrfs] > > [ 58.164393] vfs_fsync_range+0x5f/0xd0 > > [ 58.164898] do_fsync+0x5a/0x90 > > [ 58.165170] SyS_fsync+0x10/0x20 > > [ 58.165395] entry_SYSCALL_64_fastpath+0x1f/0xbe > > ... > > > > It turns out that we could record btrfs_log_ctx:io_err in > > log_one_extents when IO fails, but make log_one_extents() return '0' > > instead of -EIO, so the IO error is not acknowledged by the callers, > > i.e. btrfs_log_inode_parent(), which would remove btrfs_log_ctx:list > > from list head 'root->log_ctxs'. Since btrfs_log_ctx is allocated > > from stack memory, it'd get freed with a object alive on the > > list. then a future list_add will throw the above warning. > > > > This returns the correct error in the above case. > > > > Jeff also reported this while testing against his fsync error > > patch set[1]. > > > > [1]: https://www.spinics.net/lists/linux-btrfs/msg65308.html > > "btrfs list corruption and soft lockups while testing writeback error handling" > > > > Fixes: 8407f553268a4611f254 ("Btrfs: fix data corruption after fast fsync and writeback error") > > you can also add the commit author to CC. Hmm, somehow I thought ctx->list hasn't been added at that time, but looks like you're right. > > > Signed-off-by: Liu Bo <bo.li.liu@oracle.com> > > --- > > fs/btrfs/file.c | 1 + > > fs/btrfs/tree-log.c | 2 +- > > 2 files changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c > > index 063180b..db70eaa 100644 > > --- a/fs/btrfs/file.c > > +++ b/fs/btrfs/file.c > > @@ -2241,6 +2241,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync) > > if (ctx.io_err) { > > btrfs_end_transaction(trans); > > ret = ctx.io_err; > > + ASSERT(list_empty(&ctx.list)); > > Please move that to the label 'out', so all exit paths can catch the > problem. > 'out' can't be used as we can goto 'out' before ctx is initialized, I'll add a new label 'out_ctx' in a separate patch (Feel free to fold them into one if you prefer). Thanks, -liubo > > goto out; > > } > > > > diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c > > index c800d06..d300284 100644 > > --- a/fs/btrfs/tree-log.c > > +++ b/fs/btrfs/tree-log.c > > @@ -4100,7 +4100,7 @@ static int log_one_extent(struct btrfs_trans_handle *trans, > > > > if (ordered_io_err) { > > ctx->io_err = -EIO; > > - return 0; > > + return ctx->io_err; > > } > > > > btrfs_init_map_token(&token); > > -- > > 2.9.4 > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > > the body of a message to majordomo@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Nov 20, 2017 at 10:37:25AM -0700, Liu Bo wrote: > > > [1]: https://www.spinics.net/lists/linux-btrfs/msg65308.html > > > "btrfs list corruption and soft lockups while testing writeback error handling" > > > > > > > Fixes: 8407f553268a4611f254 ("Btrfs: fix data corruption after fast fsync and writeback error") > > > > you can also add the commit author to CC. > > Hmm, somehow I thought ctx->list hasn't been added at that time, but > looks like you're right. The commit added + if (ordered_io_err) { + ctx->io_err = -EIO; + return 0; + } + which you fix. > > > --- a/fs/btrfs/file.c > > > +++ b/fs/btrfs/file.c > > > @@ -2241,6 +2241,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync) > > > if (ctx.io_err) { > > > btrfs_end_transaction(trans); > > > ret = ctx.io_err; > > > + ASSERT(list_empty(&ctx.list)); > > > > Please move that to the label 'out', so all exit paths can catch the > > problem. > > > > 'out' can't be used as we can goto 'out' before ctx is initialized, > I'll add a new label 'out_ctx' in a separate patch (Feel free to fold > them into one if you prefer). Would be better to initialize ctx->list at the beginning and then always do the list_empty check, instead of selectively jumping to out or out_ctx. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 063180b..db70eaa 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2241,6 +2241,7 @@ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync) if (ctx.io_err) { btrfs_end_transaction(trans); ret = ctx.io_err; + ASSERT(list_empty(&ctx.list)); goto out; } diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index c800d06..d300284 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -4100,7 +4100,7 @@ static int log_one_extent(struct btrfs_trans_handle *trans, if (ordered_io_err) { ctx->io_err = -EIO; - return 0; + return ctx->io_err; } btrfs_init_map_token(&token);
Xfstests btrfs/146 revealed this corruption, [ 58.138831] Buffer I/O error on dev dm-0, logical block 2621424, async page read [ 58.151233] BTRFS error (device sdf): bdev /dev/mapper/error-test errs: wr 1, rd 0, flush 0, corrupt 0, gen 0 [ 58.152403] list_add corruption. prev->next should be next (ffff88005e6775d8), but was ffffc9000189be88. (prev=ffffc9000189be88). [ 58.153518] ------------[ cut here ]------------ [ 58.153892] WARNING: CPU: 1 PID: 1287 at lib/list_debug.c:31 __list_add_valid+0x169/0x1f0 ... [ 58.157379] RIP: 0010:__list_add_valid+0x169/0x1f0 ... [ 58.161956] Call Trace: [ 58.162264] btrfs_log_inode_parent+0x5bd/0xfb0 [btrfs] [ 58.163583] btrfs_log_dentry_safe+0x60/0x80 [btrfs] [ 58.164003] btrfs_sync_file+0x4c2/0x6f0 [btrfs] [ 58.164393] vfs_fsync_range+0x5f/0xd0 [ 58.164898] do_fsync+0x5a/0x90 [ 58.165170] SyS_fsync+0x10/0x20 [ 58.165395] entry_SYSCALL_64_fastpath+0x1f/0xbe ... It turns out that we could record btrfs_log_ctx:io_err in log_one_extents when IO fails, but make log_one_extents() return '0' instead of -EIO, so the IO error is not acknowledged by the callers, i.e. btrfs_log_inode_parent(), which would remove btrfs_log_ctx:list from list head 'root->log_ctxs'. Since btrfs_log_ctx is allocated from stack memory, it'd get freed with a object alive on the list. then a future list_add will throw the above warning. This returns the correct error in the above case. Jeff also reported this while testing against his fsync error patch set[1]. [1]: https://www.spinics.net/lists/linux-btrfs/msg65308.html "btrfs list corruption and soft lockups while testing writeback error handling" Signed-off-by: Liu Bo <bo.li.liu@oracle.com> --- fs/btrfs/file.c | 1 + fs/btrfs/tree-log.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-)