Message ID | 20130130163832.GH3660@localhost.localdomain (mailing list archive) |
---|---|
State | New, archived |
On 01/30/2013 09:38 AM, Josef Bacik wrote:
> On Tue, Jan 29, 2013 at 04:05:17PM -0700, Jim Schutt wrote:
>> > On 01/29/2013 01:04 PM, Josef Bacik wrote:
>>> > > On Tue, Jan 29, 2013 at 11:41:10AM -0700, Jim Schutt wrote:
>>>>> > >> > On 01/28/2013 02:23 PM, Josef Bacik wrote:
>>>>>>> > >>> > > On Thu, Jan 03, 2013 at 11:44:46AM -0700, Jim Schutt wrote:
>>>>>>>>> > >>>> > >> Hi Josef,
>>>>>>>>> > >>>> > >>
>>>>>>>>> > >>>> > >> Thanks for the patch - sorry for the long delay in testing...
>>>>>>>>> > >>>> > >>
>>>>>>> > >>> > >
>>>>>>> > >>> > > Jim,
>>>>>>> > >>> > >
>>>>>>> > >>> > > I've been trying to reason out how this happens, could you do a btrfs fi df on
>>>>>>> > >>> > > the filesystem thats giving you trouble so I can see if what I think is
>>>>>>> > >>> > > happening is what's actually happening. Thanks,
>>>>> > >> >
>>>>> > >> > Here's an example, using a slightly different kernel than
>>>>> > >> > my previous report. It's your btrfs-next master branch
>>>>> > >> > (commit 8f139e59d5 "Btrfs: use bit operation for ->fs_state")
>>>>> > >> > with ceph 3.8 for-linus (commit 0fa6ebc600 from linus' tree).
>>>>> > >> >
>>>>> > >> >
>>>>> > >> > Here I'm finding the file system in question:
>>>>> > >> >
>>>>> > >> > # ls -l /dev/mapper | grep dm-93
>>>>> > >> > lrwxrwxrwx 1 root root 8 Jan 29 11:13 cs53s19p2 -> ../dm-93
>>>>> > >> >
>>>>> > >> > # df -h | grep -A 1 cs53s19p2
>>>>> > >> > /dev/mapper/cs53s19p2
>>>>> > >> > 896G 1.1G 896G 1% /ram/mnt/ceph/data.osd.522
>>>>> > >> >
>>>>> > >> >
>>>>> > >> > Here's the info you asked for:
>>>>> > >> >
>>>>> > >> > # btrfs fi df /ram/mnt/ceph/data.osd.522
>>>>> > >> > Data: total=2.01GB, used=1.00GB
>>>>> > >> > System: total=4.00MB, used=64.00KB
>>>>> > >> > Metadata: total=8.00MB, used=7.56MB
>>>>> > >> >
>>> > > How big is the disk you are using, and what mount options? I have a patch to
>>> > > keep the panic from happening and hopefully the abort, could you try this? I
>>> > > still want to keep the underlying error from happening because it shouldn't be,
>>> > > but no reason I can't fix the error case while you can easily reproduce it :).
>>> > > Thanks,
>>> > >
>>> > > Josef
>>> > >
>>> > >>From c50b725c74c7d39064e553ef85ac9753efbd8aec Mon Sep 17 00:00:00 2001
>>> > > From: Josef Bacik <jbacik@fusionio.com>
>>> > > Date: Tue, 29 Jan 2013 15:03:37 -0500
>>> > > Subject: [PATCH] Btrfs: fix chunk allocation error handling
>>> > >
>>> > > If we error out allocating a dev extent we will have already created the
>>> > > block group and such which will cause problems since the allocator may have
>>> > > tried to allocate out of the block group that no longer exists. This will
>>> > > cause BUG_ON()'s in the bio submission path. This also makes a failure to
>>> > > allocate a dev extent a non-abort error, we will just clean up the dev
>>> > > extents we did allocate and exit. Now if we fail to delete the dev extents
>>> > > we will abort since we can't have half of the dev extents hanging around,
>>> > > but this will make us much less likely to abort. Thanks,
>>> > >
>>> > > Signed-off-by: Josef Bacik <jbacik@fusionio.com>
>>> > > ---
>> >
>> > Interesting - with your patch applied I triggered the following, just
>> > bringing up a fresh Ceph filesystem - I didn't even get a chance to
>> > mount it on my Ceph clients:
>> >
> Ok can you give this patch a whirl as well? It seems to fix the problem for me.

With this patch on top of your previous patch, after several trials of
my test I am also unable to reproduce the issue. Since I had been
having trouble first time, every time, I think it also seems to fix
the problem for me.

Thanks again!

-- Jim

> Thanks,
>
> Josef

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jan 30, 2013 at 02:37:40PM -0700, Jim Schutt wrote:
> On 01/30/2013 09:38 AM, Josef Bacik wrote:
> > Ok can you give this patch a whirl as well? It seems to fix the problem for me.
>
> With this patch on top of your previous patch, after several trials of
> my test I am also unable to reproduce the issue. Since I had been
> having trouble first time, every time, I think it also seems to fix
> the problem for me.
>
> Thanks again!
>

Awesome thanks for testing!

Josef
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index dca5679..874bcf2 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3677,8 +3677,18 @@ static int can_overcommit(struct btrfs_root *root,
 	u64 used;

 	used = space_info->bytes_used + space_info->bytes_reserved +
-		space_info->bytes_pinned + space_info->bytes_readonly +
-		space_info->bytes_may_use;
+		space_info->bytes_pinned + space_info->bytes_readonly;
+
+	/*
+	 * We only want to allow over committing if we have lots of actual space
+	 * free, but if we've tied up more than 80% of the space with actual
+	 * space reservation (not including bytes we _might_ use) then don't
+	 * allow overcommitting as it will just make things go badly for us.
+	 */
+	if (used > div_factor(space_info->total_bytes, 8))
+		return 0;
+
+	used += space_info->bytes_may_use;

 	spin_lock(&root->fs_info->free_chunk_lock);
 	avail = root->fs_info->free_chunk_space;
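
For anyone following along, the early-out added to can_overcommit() can be modelled in user space roughly as below. This is only a sketch under stated assumptions, not the kernel code: struct space_info_sketch, div_factor_sketch() and overcommit_refused_early() are hypothetical names made up for illustration, and div_factor() is assumed to compute num * factor / 10. It covers only the new 80% check and ignores the rest of the function (the free_chunk_space handling that follows the hunk).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the space_info counters the hunk reads. */
struct space_info_sketch {
	uint64_t total_bytes;
	uint64_t bytes_used;
	uint64_t bytes_reserved;
	uint64_t bytes_pinned;
	uint64_t bytes_readonly;
	uint64_t bytes_may_use;	/* deliberately left out of the 80% check */
};

/* Assumption: div_factor(num, f) scales num by f/10 using integer division. */
static uint64_t div_factor_sketch(uint64_t num, int factor)
{
	return num * (uint64_t)factor / 10;
}

/*
 * Returns 1 when hard usage alone exceeds 80% of total_bytes, which is the
 * case where the patched can_overcommit() returns 0 before doing anything
 * else. Returns 0 when the patched code would go on to its remaining checks.
 */
static int overcommit_refused_early(const struct space_info_sketch *si)
{
	uint64_t used;

	assert(si != NULL);

	/* Same sum as the patched hunk: bytes_may_use is not counted here. */
	used = si->bytes_used + si->bytes_reserved +
	       si->bytes_pinned + si->bytes_readonly;

	return used > div_factor_sketch(si->total_bytes, 8);
}
```

Plugging in the Metadata line from the btrfs fi df output earlier in the thread (total=8.00MB, used=7.56MB), and assuming "used" maps to bytes_used with the other counters at zero, 7.56MB is above the 6.4MB threshold, so this sketch reports that overcommit would be refused.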