From patchwork Fri Jul 22 04:06:40 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Miao Xie
X-Patchwork-Id: 997962
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
    by demeter1.kernel.org (8.14.4/8.14.4) with ESMTP id p6M3vPh2012716
    for ; Fri, 22 Jul 2011 03:57:25 GMT
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1752030Ab1GVD5T (ORCPT ); Thu, 21 Jul 2011 23:57:19 -0400
Received: from cn.fujitsu.com ([222.73.24.84]:56982 "EHLO song.cn.fujitsu.com"
    rhost-flags-OK-FAIL-OK-OK) by vger.kernel.org with ESMTP
    id S1751722Ab1GVD5T convert rfc822-to-8bit (ORCPT );
    Thu, 21 Jul 2011 23:57:19 -0400
Received: from tang.cn.fujitsu.com (tang.cn.fujitsu.com [10.167.250.3])
    by song.cn.fujitsu.com (Postfix) with ESMTP id 0FBE317003F;
    Fri, 22 Jul 2011 11:57:14 +0800 (CST)
Received: from mailserver.fnst.cn.fujitsu.com (tang.cn.fujitsu.com [127.0.0.1])
    by tang.cn.fujitsu.com (8.14.3/8.13.1) with ESMTP id p6M3vBqI020530;
    Fri, 22 Jul 2011 11:57:12 +0800
Received: from [10.167.225.64] ([10.167.225.64])
    by mailserver.fnst.cn.fujitsu.com (Lotus Domino Release 8.5.1FP4)
    with ESMTP id 2011072211562018-900312 ; Fri, 22 Jul 2011 11:56:20 +0800
Message-ID: <4E28F750.9060405@cn.fujitsu.com>
Date: Fri, 22 Jul 2011 12:06:40 +0800
From: Miao Xie
Reply-To: miaox@cn.fujitsu.com
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.9)
    Gecko/20100921 Fedora/3.1.4-1.fc14 Thunderbird/3.1.4
MIME-Version: 1.0
To: Arne Jansen
CC: Chris Mason, Tsutomu Itoh, linux-btrfs, Josef Bacik
Subject: Re: new metadata reader/writer locks in integration-test
References: <1311096438-sup-1263@shiny> <1311182478-sup-9986@shiny>
    <4E277757.9070504@jp.fujitsu.com> <4E27BD4F.6020900@gmx.net>
    <1311295973-sup-3312@shiny>
In-Reply-To: <1311295973-sup-3312@shiny>
X-MIMETrack: Itemize by SMTP Server on mailserver/fnst(Release 8.5.1FP4|July 25, 2010) at 2011-07-22 11:56:20,
    Serialize by Router on mailserver/fnst(Release 8.5.1FP4|July 25, 2010) at 2011-07-22 11:56:22
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org
X-Greylist: IP, sender and recipient auto-whitelisted, not delayed by
    milter-greylist-4.2.6 (demeter1.kernel.org [140.211.167.41]);
    Fri, 22 Jul 2011 03:57:25 +0000 (UTC)

On Thu, 21 Jul 2011 20:53:24 -0400, Chris Mason wrote:
>>>> Hi everyone,
>>>>
>>>> I just rebased Josef's enospc fixes into integration-test, it should fix
>>>> the warnings in extent-tree.c
>>>>
>>>
>>> Unfortunately, I got the following messages.
>>>
>>>
>>> Jul 21 09:41:22 luna kernel: ------------[ cut here ]------------
>>> Jul 21 09:41:22 luna kernel: WARNING: at fs/btrfs/extent-tree.c:5564 btrfs_alloc_reserved_file_extent+0xf8/0x100 [btrfs]()
>>> Jul 21 09:41:22 luna kernel: Hardware name: PRIMERGY
>>> Jul 21 09:41:22 luna kernel: Modules linked in: btrfs zlib_deflate crc32c libcrc32c autofs4 sunrpc 8021q garp stp llc cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 ext3 jbd dm_mirror dm_region_hash dm_log dm_mod kvm uinput ppdev parport_pc parport sg pcspkr i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support tg3 shpchp pci_hotplug i3000_edac edac_core ext4 mbcache jbd2 crc16 sd_mod crc_t10dif sr_mod cdrom megaraid_sas floppy pata_acpi ata_generic ata_piix libata scsi_mod [last unloaded: microcode]
>>> Jul 21 09:41:22 luna kernel: Pid: 5517, comm: btrfs-endio-wri Tainted: G W 2.6.39btrfs-tc1+ #1
>>> Jul 21 09:41:22 luna kernel: Call Trace:
>>> Jul 21 09:41:22 luna kernel: [] warn_slowpath_common+0x7f/0xc0
>>> Jul 21 09:41:22 luna kernel: [] warn_slowpath_null+0x1a/0x20
>>> Jul 21 09:41:22 luna kernel: [] btrfs_alloc_reserved_file_extent+0xf8/0x100 [btrfs]
>>> Jul 21 09:41:22 luna kernel: [] insert_reserved_file_extent.clone.0+0x201/0x270 [btrfs]
>>> Jul 21 09:41:22 luna kernel: [] btrfs_finish_ordered_io+0x2eb/0x360 [btrfs]
>>> Jul 21 09:41:22 luna kernel: [] ? try_to_del_timer_sync+0x83/0xe0
>>> Jul 21 09:41:22 luna kernel: [] btrfs_writepage_end_io_hook+0x50/0xa0 [btrfs]
>>> Jul 21 09:41:22 luna kernel: [] end_compressed_bio_write+0x86/0xf0 [btrfs]
>>> Jul 21 09:41:22 luna kernel: [] bio_endio+0x1d/0x40
>>> Jul 21 09:41:22 luna kernel: [] end_workqueue_fn+0xf4/0x130 [btrfs]
>>> Jul 21 09:41:22 luna kernel: [] worker_loop+0x13e/0x540 [btrfs]
>>> Jul 21 09:41:22 luna kernel: [] ? btrfs_queue_worker+0x2d0/0x2d0 [btrfs]
>>> Jul 21 09:41:22 luna kernel: [] ? btrfs_queue_worker+0x2d0/0x2d0 [btrfs]
>>> Jul 21 09:41:22 luna kernel: [] kthread+0x96/0xa0
>>> Jul 21 09:41:22 luna kernel: [] kernel_thread_helper+0x4/0x10
>>> Jul 21 09:41:22 luna kernel: [] ? kthread_worker_fn+0x1a0/0x1a0
>>> Jul 21 09:41:22 luna kernel: [] ? gs_change+0x13/0x13
>>> Jul 21 09:41:22 luna kernel: ---[ end trace 02c1fa3044677043 ]---
>>>
>>
>> a very similar warning here, but without compression involved:
>
> Ok, these are probably the enospc fixes.  Could you please try bisecting
> out some of Josef's patches?

I did a binary search and found that the following patch causes this problem.

  commit 97ffc7d564f55787c7d9ea557d5d30d9ecb2f003
  Author: Josef Bacik
  Date:   Fri Jul 15 18:29:11 2011 +0000

      Btrfs: don't be as agressive with delalloc metadata reservations

      Currently we reserve enough space to COW an entirely full btree for every ex
      we have reserved for an inode.  This _sucks_, because you only need to COW o
      and then everybody else is ok.  Unfortunately we don't know we'll all be abl
      get into the same transaction so that's what we have had to do.  But the glo
      reserve holds a reservation large enough to cover a large percentage of all
      metadata currently in the fs.  So all we really need to account for is any n
      blocks that we may allocate.  So fix this by ??……

The reason is that the reservation is calculated incorrectly: nodes in the
search path may be split and new nodes created, but the above patch does not
reserve space for these new nodes.

The following patch should fix it. Though my test passed, I still need Arne's
verification to make sure it fixes all the reported problems.

Arne, could you test it for me?

Subject: [PATCH] Btrfs: fix wrong calculation of the reservation for the transaction

At worst, Btrfs may split all the nodes in the search path, so we must take
those new nodes into account when we calculate the space that needs to be
reserved.

Signed-off-by: Miao Xie
---
 fs/btrfs/ctree.h |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index d813a67..4f23819 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2133,10 +2133,16 @@ static inline bool btrfs_mixed_space_info(struct btrfs_space_info *space_info)
 }
 
 /* extent-tree.c */
+/*
+ * This inline function is used to calc the size of new nodes/leaves that we
+ * may create. At worst, we may split all the nodes in the path and create
+ * two leaves for the insertion of one item.
+ */
 static inline u64 btrfs_calc_trans_metadata_size(struct btrfs_root *root,
 						 unsigned num_items)
 {
-	return root->leafsize * 3 * num_items;
+	return (root->leafsize * 2 + root->nodesize * (BTRFS_MAX_LEVEL - 1)) *
+		num_items;
 }
 
 void btrfs_put_block_group(struct btrfs_block_group_cache *cache);
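
For illustration only (this is not part of the patch): a small userspace
sketch that compares the old and new per-item reservation. The struct
sketch_root type and the sketch_* names are hypothetical stand-ins for
struct btrfs_root and btrfs_calc_trans_metadata_size(), and
SKETCH_MAX_LEVEL assumes the kernel's BTRFS_MAX_LEVEL value of 8.

```c
#include <stdint.h>

/* Assumption: mirrors the kernel's BTRFS_MAX_LEVEL, which is 8. */
#define SKETCH_MAX_LEVEL 8

/* Hypothetical stand-in for the two struct btrfs_root fields the patch reads. */
struct sketch_root {
	uint32_t leafsize;
	uint32_t nodesize;
};

/* Reservation before the patch: three leaves per item. */
uint64_t sketch_calc_old(const struct sketch_root *root, unsigned num_items)
{
	return (uint64_t)root->leafsize * 3 * num_items;
}

/*
 * Reservation after the patch: two leaves for the insertion, plus one new
 * node for every non-leaf level in case each node in the path is split.
 */
uint64_t sketch_calc_new(const struct sketch_root *root, unsigned num_items)
{
	return ((uint64_t)root->leafsize * 2 +
		(uint64_t)root->nodesize * (SKETCH_MAX_LEVEL - 1)) * num_items;
}
```

Assuming leafsize and nodesize are both 4096 bytes, the old formula
reserves 12288 bytes per item and the new one 36864 bytes per item, so
the worst-case reservation triples.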