Message ID | CAKcLGm_jFa_2FaaaudRZF8JRymh1t3ASMFN1tFGbs0PGt1vNaQ@mail.gmail.com (mailing list archive) |
---|---|
State | New, archived |
On Wed, Aug 21, 2013 at 08:44:55AM -0500, Mitch Harder wrote:
> On Thu, Aug 15, 2013 at 12:29 PM, Mitch Harder
> <mitch.harder@sabayonlinux.org> wrote:
> > I'm running into a curious problem.
> >
> > In the process of making my script portable, I am breaking the ability
> > to replicate the error.
> >
> > I'm trying to isolate the aspect of my local script that is triggering
> > the error. No firm insights yet.
> >
> >
> > On Tue, Aug 13, 2013 at 11:03 AM, Mitch Harder
> > <mitch.harder@sabayonlinux.org> wrote:
> >> Let me work on making that script more portable, and hopefully quicker
> >> to reproduce.
> >>
> >> On Tue, Aug 13, 2013 at 9:15 AM, Josef Bacik <jbacik@fusionio.com> wrote:
> >>> On Mon, Aug 12, 2013 at 11:06:27PM -0500, Mitch Harder wrote:
> >>>> I'm hitting a btrfs Kernel BUG running a snapshot stress script with
> >>>> linux-3.11.0-rc5.
> >>>>
> >>>
> >>> I can haz script? Thanks,
> >>>
>
> I've had a hard time assembling a portable reproducer for this issue.
>
> I discovered that my reproducer was highly dependent on a local
> archive of out-of-date git kernel sources. My efforts to reproduce
> the error with a portable set of scripts with publicly available
> kernel git sources weren't successful.
>
> It seems like this issue is related to a corner-case workload that is
> difficult to reproduce.
>
> So I've bisected the error I was seeing with my local script, and
> identified the following commit as triggering my issue:
>
> commit: 3c64a1aba7cfcb04f79e76f859b3d66660275d59
> Btrfs: cleanup: don't check the same thing twice
> https://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git/commit/fs/btrfs?h=for-linus&id=3c64a1aba7cfcb04
>
> I tested a kernel which reverted this change, and also added WARN_ON
> lines to provide a back trace.
>

Well that works too :). I'll look at this when I get back from the doctor in a
few hours and see if I can't figure out why it started happening.

Thanks,

Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Aug 21, 2013 at 08:44:55AM -0500, Mitch Harder wrote:
> [...]
>
> diff --git a/fs/btrfs/export.c b/fs/btrfs/export.c
> index 4b86916..336d628 100644
> --- a/fs/btrfs/export.c
> +++ b/fs/btrfs/export.c
> @@ -82,6 +82,12 @@ static struct dentry *btrfs_get_dentry(struct super_block *sb, u64 objectid,
>  		goto fail;
>  	}
>  
> +	if (btrfs_root_refs(&root->root_item) == 0) {
> +		WARN_ON(1);
> +		err = -ENOENT;
> +		goto fail;
> +	}
> +
>  	key.objectid = objectid;
>  	btrfs_set_key_type(&key, BTRFS_INODE_ITEM_KEY);
>  	key.offset = 0;
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 94413af..4010257 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -310,6 +310,12 @@ static int __btrfs_run_defrag_inode(struct btrfs_fs_info *fs_info,
>  		goto cleanup;
>  	}
>  
> +	if (btrfs_root_refs(&inode_root->root_item) == 0) {
> +		WARN_ON(1);
> +		ret = -ENOENT;
> +		goto cleanup;
> +	}
> +

Funnily enough I just added this check back in a different commit. Now that I
look at the reasoning tho this cleanup patch was wrong. We do check if
root_refs is 0 in btrfs_read_fs_root_no_name, but only if the root isn't
already in cache. If it is in cache we will happily return it with no issue.
So either we should add the extra check for the in-cache case (probably a good
idea), or go back and add all of these checks back.

Thanks,

Josef
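The in-cache case Josef describes can be modeled in a few lines of userspace C. This is only a minimal sketch under stated assumptions, not the btrfs code: `struct toy_root`, `find()`, `lookup_unchecked()` and `lookup_checked()` are hypothetical names invented for illustration, and the "cache" is a flag on a static array rather than the real root cache. It shows a lookup that validates the ref count only on the cache-miss path, and one possible way to apply the same check to cache hits.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a subvolume root; only the ref count matters. */
struct toy_root {
	unsigned long long objectid;
	unsigned int refs;	/* 0 models a root that has been deleted */
	int cached;		/* 1 once a lookup has cached this root */
};

#define NR_ROOTS 2
static struct toy_root roots[NR_ROOTS] = {
	{ .objectid = 256, .refs = 1, .cached = 0 },
	{ .objectid = 257, .refs = 1, .cached = 0 },
};

/* Hypothetical helper: linear search standing in for the on-disk lookup. */
static struct toy_root *find(unsigned long long objectid)
{
	for (size_t i = 0; i < NR_ROOTS; i++)
		if (roots[i].objectid == objectid)
			return &roots[i];
	return NULL;
}

/* Problem shape: refs is checked only when the root is not yet cached. */
int lookup_unchecked(unsigned long long objectid, struct toy_root **out)
{
	struct toy_root *r = find(objectid);

	if (!r)
		return -ENOENT;
	if (r->cached) {
		*out = r;	/* cache hit: returned without looking at refs */
		return 0;
	}
	if (r->refs == 0)
		return -ENOENT;	/* cache miss: dead roots are rejected */
	r->cached = 1;
	*out = r;
	return 0;
}

/* One possible fix: apply the refs check to cache hits as well. */
int lookup_checked(unsigned long long objectid, struct toy_root **out)
{
	int ret = lookup_unchecked(objectid, out);

	if (ret == 0 && (*out)->refs == 0) {
		*out = NULL;
		return -ENOENT;
	}
	return ret;
}
```

In this model, if root 257 is looked up once (which caches it) and its `refs` then drops to 0, `lookup_unchecked(257, &r)` still returns 0 and hands back the dead root, while `lookup_checked(257, &r)` returns `-ENOENT`. The per-caller checks in Mitch's diff correspond to the other option Josef mentions: leaving the lookup alone and testing `refs` at every call site.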
On Wed, 21 Aug 2013 08:44:55 -0500, Mitch Harder wrote:
> I've had a hard time assembling a portable reproducer for this issue.
>
> I discovered that my reproducer was highly dependent on a local
> archive of out-of-date git kernel sources. My efforts to reproduce
> the error with a portable set of scripts with publicly available
> kernel git sources weren't successful.
>
> It seems like this issue is related to a corner-case workload that is
> difficult to reproduce.
>
> So I've bisected the error I was seeing with my local script, and
> identified the following commit as triggering my issue:
>
> commit: 3c64a1aba7cfcb04f79e76f859b3d66660275d59
> Btrfs: cleanup: don't check the same thing twice
> https://git.kernel.org/cgit/linux/kernel/git/mason/linux-btrfs.git/commit/fs/btrfs?h=for-linus&id=3c64a1aba7cfcb04
>
> I tested a kernel which reverted this change, and also added WARN_ON
> lines to provide a back trace.
[...]
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index cd46e2c..a1091f7 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -2302,6 +2302,12 @@ static noinline int relink_extent_backref(struct btrfs_path *path,
>  			return 0;
>  		return PTR_ERR(root);
>  	}
> +	if (btrfs_root_refs(&root->root_item) == 0) {
> +		srcu_read_unlock(&fs_info->subvol_srcu, index);
> +		/* parse ENOENT to 0 */
> +		WARN_ON(1);
> +		return 0;
> +	}
[...]
> [ 1616.886868] ------------[ cut here ]------------
> [ 1616.886912] WARNING: at fs/btrfs/inode.c:2308 relink_extent_backref+0x103/0x721 [btrfs]()
> [ 1616.887050] Call Trace:
> [ 1616.887064] [<ffffffff8161a34a>] dump_stack+0x19/0x1b
> [ 1616.887071] [<ffffffff8103035a>] warn_slowpath_common+0x67/0x80
> [ 1616.887077] [<ffffffff8103038d>] warn_slowpath_null+0x1a/0x1c
> [ 1616.887100] [<ffffffffa019ea82>] relink_extent_backref+0x103/0x721
> [ 1616.887205] [<ffffffffa019f7e2>] btrfs_finish_ordered_io+0x742/0x829

Mitch,

Thank you for this excellent work to find the cause of the issue. I've
sent a patch "Btrfs: fix for patch "cleanup: don't check the same thing
twice"" and would appreciate if you could repeat your test, just to make
sure, because I was never able to reproduce this issue myself.
On Fri, Aug 23, 2013 at 3:48 AM, Stefan Behrens <sbehrens@giantdisaster.de> wrote:
> On Wed, 21 Aug 2013 08:44:55 -0500, Mitch Harder wrote:
> [...]
>> [ 1616.886868] ------------[ cut here ]------------
>> [ 1616.886912] WARNING: at fs/btrfs/inode.c:2308 relink_extent_backref+0x103/0x721 [btrfs]()
>> [ 1616.887050] Call Trace:
>> [ 1616.887064] [<ffffffff8161a34a>] dump_stack+0x19/0x1b
>> [ 1616.887071] [<ffffffff8103035a>] warn_slowpath_common+0x67/0x80
>> [ 1616.887077] [<ffffffff8103038d>] warn_slowpath_null+0x1a/0x1c
>> [ 1616.887100] [<ffffffffa019ea82>] relink_extent_backref+0x103/0x721
>> [ 1616.887205] [<ffffffffa019f7e2>] btrfs_finish_ordered_io+0x742/0x829
>
> Mitch,
>
> Thank you for this excellent work to find the cause of the issue. I've
> sent a patch "Btrfs: fix for patch "cleanup: don't check the same thing
> twice"" and would appreciate if you could repeat your test, just to make
> sure, because I was never able to reproduce this issue myself.
>

Thanks. I've tested my "special" workload with your patch on the latest
3.11_rc6 kernel, and the patch corrects the errors I was encountering.
diff --git a/fs/btrfs/export.c b/fs/btrfs/export.c
index 4b86916..336d628 100644
--- a/fs/btrfs/export.c
+++ b/fs/btrfs/export.c
@@ -82,6 +82,12 @@ static struct dentry *btrfs_get_dentry(struct super_block *sb, u64 objectid,
 		goto fail;
 	}
 
+	if (btrfs_root_refs(&root->root_item) == 0) {
+		WARN_ON(1);
+		err = -ENOENT;
+		goto fail;
+	}
+
 	key.objectid = objectid;
 	btrfs_set_key_type(&key, BTRFS_INODE_ITEM_KEY);
 	key.offset = 0;
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 94413af..4010257 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -310,6 +310,12 @@ static int __btrfs_run_defrag_inode(struct btrfs_fs_info *fs_info,
 		goto cleanup;
 	}
 
+	if (btrfs_root_refs(&inode_root->root_item) == 0) {
+		WARN_ON(1);
+		ret = -ENOENT;
+		goto cleanup;
+	}
+
 	key.objectid = defrag->ino;
 	btrfs_set_key_type(&key, BTRFS_INODE_ITEM_KEY);
 	key.offset = 0;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index cd46e2c..a1091f7 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2302,6 +2302,12 @@ static noinline int relink_extent_backref(struct btrfs_path *path,
 			return 0;
 		return PTR_ERR(root);
 	}
+	if (btrfs_root_refs(&root->root_item) == 0) {
+		srcu_read_unlock(&fs_info->subvol_srcu, index);
+		/* parse ENOENT to 0 */
+		WARN_ON(1);
+		return 0;
+	}
 
 	/* step 2: get inode */
 	key.objectid = backref->inum;
@@ -4703,6 +4709,12 @@ static int fixup_tree_root_location(struct btrfs_root *root,
 		goto out;
 	}
 
+	if (btrfs_root_refs(&new_root->root_item) == 0) {
+		WARN_ON(1);
+		err = -ENOENT;
+		goto out;
+	}
+
 	*sub_root = new_root;
 	location->objectid = btrfs_root_dirid(&new_root->root_item);
 	location->type = BTRFS_INODE_ITEM_KEY;
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 0e17a30..0f74235 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2969,6 +2969,12 @@ static long btrfs_ioctl_default_subvol(struct file *file, void __user *argp)
 		goto out;
 	}
 
+	if (btrfs_root_refs(&new_root->root_item) == 0) {
+		WARN_ON(1);
+		ret = -ENOENT;
+		goto out;
+	}
+
 	path = btrfs_alloc_path();
 	if (!path) {
 		ret = -ENOMEM;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index b267c3c..3cf4716 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -793,6 +793,11 @@ find_root:
 	if (IS_ERR(new_root))
 		return ERR_CAST(new_root);
 
+	if (btrfs_root_refs(&new_root->root_item) == 0) {
+		WARN_ON(1);
+		return ERR_PTR(-ENOENT);
+	}
+
 	dir_id = btrfs_root_dirid(&new_root->root_item);
 setup_root:
 	location.objectid = dir_id;