From patchwork Fri Jul 15 10:34:36 2011
X-Patchwork-Submitter: Miao Xie
X-Patchwork-Id: 977722
Message-ID: <4E2017BC.8060903@cn.fujitsu.com>
Date: Fri, 15 Jul 2011 18:34:36 +0800
From: Miao Xie
Reply-To: miaox@cn.fujitsu.com
To: Linux Btrfs
CC: Chris Mason
Subject: [PATCH] Btrfs: fix BUG_ON() caused by ENOSPC when relocating space
X-Mailing-List: linux-btrfs@vger.kernel.org

When we balanced the chunks across the devices, BUG_ON() in
__finish_chunk_alloc() was triggered.

------------[ cut here ]------------
kernel BUG at fs/btrfs/volumes.c:2568!
[SNIP]
Call Trace:
 [] btrfs_alloc_chunk+0x8e/0xa0 [btrfs]
 [] do_chunk_alloc+0x330/0x3a0 [btrfs]
 [] btrfs_reserve_extent+0xb4/0x1f0 [btrfs]
 [] btrfs_alloc_free_block+0xdb/0x350 [btrfs]
 [] ? read_extent_buffer+0xd8/0x1d0 [btrfs]
 [] __btrfs_cow_block+0x14d/0x5e0 [btrfs]
 [] ? read_block_for_search+0x14d/0x4d0 [btrfs]
 [] btrfs_cow_block+0x10b/0x240 [btrfs]
 [] btrfs_search_slot+0x49e/0x7a0 [btrfs]
 [] btrfs_insert_empty_items+0x8d/0xf0 [btrfs]
 [] insert_with_overflow+0x43/0x110 [btrfs]
 [] btrfs_insert_dir_item+0xcd/0x1f0 [btrfs]
 [] ? map_extent_buffer+0xb0/0xc0 [btrfs]
 [] ? rb_insert_color+0x9d/0x160
 [] ? inode_tree_add+0xf0/0x150 [btrfs]
 [] btrfs_add_link+0xc1/0x1c0 [btrfs]
 [] ? security_inode_init_security+0x1c/0x30
 [] ? btrfs_init_acl+0x4a/0x180 [btrfs]
 [] btrfs_add_nondir+0x2f/0x70 [btrfs]
 [] ? btrfs_init_inode_security+0x46/0x60 [btrfs]
 [] btrfs_create+0x150/0x1d0 [btrfs]
 [] ? generic_permission+0x23/0xb0
 [] vfs_create+0xa5/0xc0
 [] do_last+0x5fe/0x880
 [] path_openat+0xcd/0x3d0
 [] do_filp_open+0x49/0xa0
 [] ? alloc_fd+0x95/0x160
 [] do_sys_open+0x107/0x1e0
 [] ? audit_syscall_entry+0x1bf/0x1f0
 [] sys_open+0x20/0x30
 [] system_call_fastpath+0x16/0x1b
[SNIP]
RIP [] __finish_chunk_alloc+0x20a/0x220 [btrfs]

The reason is:
    Task1                               Space balance task
    do_chunk_alloc()
      __finish_chunk_alloc()
        update device info
        in the chunk tree
          alloc system metadata block
                                        relocate system metadata block group
                                          set system metadata block group
                                          readonly, This block group is the
                                          only one that can allocate space. So
                                          there is no free space that can be
                                          allocated now.
            find no space and don't try
            to alloc new chunk, and then
            return ENOSPC
              BUG_ON() in __finish_chunk_alloc()
              was triggered.

Fix this bug by allocating a new system metadata chunk before relocating the
old one if we find there is no free space which can be allocated after setting
the old block group to be read-only.

Reported-by: Tsutomu Itoh
Signed-off-by: Miao Xie
Tested-by: Tsutomu Itoh
---
 fs/btrfs/extent-tree.c |   28 +++++++++++++++++++++-------
 1 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 71cd456..00c8a1a 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -6524,15 +6524,28 @@ static u64 update_block_group_flags(struct btrfs_root *root, u64 flags)
 	return flags;
 }
 
-static int set_block_group_ro(struct btrfs_block_group_cache *cache)
+static int set_block_group_ro(struct btrfs_block_group_cache *cache, int force)
 {
 	struct btrfs_space_info *sinfo = cache->space_info;
 	u64 num_bytes;
+	u64 min_allocable_bytes;
 	int ret = -ENOSPC;
 
 	if (cache->ro)
 		return 0;
 
+	/*
+	 * We need some metadata space and system metadata space for
+	 * allocating chunks in some corner cases until we force to set
+	 * it to be readonly.
+	 */
+	if ((sinfo->flags &
+	     (BTRFS_BLOCK_GROUP_SYSTEM | BTRFS_BLOCK_GROUP_METADATA)) &&
+	    !force)
+		min_allocable_bytes = 1 * 1024 * 1024;
+	else
+		min_allocable_bytes = 0;
+
 	spin_lock(&sinfo->lock);
 	spin_lock(&cache->lock);
 	num_bytes = cache->key.offset - cache->reserved - cache->pinned -
@@ -6540,7 +6553,8 @@ static int set_block_group_ro(struct btrfs_block_group_cache *cache)
 
 	if (sinfo->bytes_used + sinfo->bytes_reserved + sinfo->bytes_pinned +
 	    sinfo->bytes_may_use + sinfo->bytes_readonly +
-	    cache->reserved_pinned + num_bytes <= sinfo->total_bytes) {
+	    cache->reserved_pinned + num_bytes + min_allocable_bytes <=
+	    sinfo->total_bytes) {
 		sinfo->bytes_readonly += num_bytes;
 		sinfo->bytes_reserved += cache->reserved_pinned;
 		cache->reserved_pinned = 0;
@@ -6571,7 +6585,7 @@ int btrfs_set_block_group_ro(struct btrfs_root *root,
 		do_chunk_alloc(trans, root, 2 * 1024 * 1024, alloc_flags,
 			       CHUNK_ALLOC_FORCE);
 
-	ret = set_block_group_ro(cache);
+	ret = set_block_group_ro(cache, 0);
 	if (!ret)
 		goto out;
 	alloc_flags = get_alloc_profile(root, cache->space_info->flags);
@@ -6579,7 +6593,7 @@ int btrfs_set_block_group_ro(struct btrfs_root *root,
 			     CHUNK_ALLOC_FORCE);
 	if (ret < 0)
 		goto out;
-	ret = set_block_group_ro(cache);
+	ret = set_block_group_ro(cache, 0);
 out:
 	btrfs_end_transaction(trans, root);
 	return ret;
@@ -7016,7 +7030,7 @@ int btrfs_read_block_groups(struct btrfs_root *root)
 
 		set_avail_alloc_bits(root->fs_info, cache->flags);
 		if (btrfs_chunk_readonly(root, cache->key.objectid))
-			set_block_group_ro(cache);
+			set_block_group_ro(cache, 1);
 	}
 
 	list_for_each_entry_rcu(space_info, &root->fs_info->space_info, list) {
@@ -7030,9 +7044,9 @@ int btrfs_read_block_groups(struct btrfs_root *root)
 		 * mirrored block groups.
 		 */
 		list_for_each_entry(cache, &space_info->block_groups[3], list)
-			set_block_group_ro(cache);
+			set_block_group_ro(cache, 1);
 		list_for_each_entry(cache, &space_info->block_groups[4], list)
-			set_block_group_ro(cache);
+			set_block_group_ro(cache, 1);
 	}
 
 	init_global_block_rsv(info);
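
The accounting check that this patch changes in set_block_group_ro() can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct space_info_model, its is_meta_or_system field, and can_set_ro() are hypothetical names made up for illustration, locking is omitted, and the cache->reserved_pinned term is left out for brevity. It shows how the 1MB reserve (min_allocable_bytes) makes the check fail for metadata or system space unless force is set.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct btrfs_space_info. */
struct space_info_model {
	uint64_t total_bytes;
	uint64_t bytes_used;
	uint64_t bytes_reserved;
	uint64_t bytes_pinned;
	uint64_t bytes_may_use;
	uint64_t bytes_readonly;
	int is_meta_or_system;	/* models the SYSTEM | METADATA flag test */
};

/*
 * Model of the space check: returns 1 if a block group holding num_bytes
 * of unused space may be set read-only, 0 if that would leave less than
 * the reserve. num_bytes corresponds to the unused bytes computed from
 * the block group in the real function.
 */
static int can_set_ro(const struct space_info_model *s, uint64_t num_bytes,
		      int force)
{
	uint64_t min_allocable_bytes = 0;

	/* Keep 1MB allocatable for metadata/system space unless forced. */
	if (s->is_meta_or_system && !force)
		min_allocable_bytes = 1 * 1024 * 1024;

	return s->bytes_used + s->bytes_reserved + s->bytes_pinned +
	       s->bytes_may_use + s->bytes_readonly +
	       num_bytes + min_allocable_bytes <= s->total_bytes;
}
```

For instance, with total_bytes of 8MB, bytes_used of 4MB and a block group holding 3.5MB of unused space, the unforced check on system space adds up to 8.5MB and is refused, so the caller goes on to allocate a new chunk first; with force set (the mount-time callers in btrfs_read_block_groups()) the sum is 7.5MB and the block group is set read-only.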