From patchwork Wed Aug 3 18:07:42 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Sterba
X-Patchwork-Id: 1032152
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by demeter1.kernel.org (8.14.4/8.14.4) with ESMTP id p73I80Ys014616
	for ; Wed, 3 Aug 2011 18:08:00 GMT
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755065Ab1HCSH4 (ORCPT ); Wed, 3 Aug 2011 14:07:56 -0400
Received: from cantor2.suse.de ([195.135.220.15]:49555 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1754719Ab1HCSH4 (ORCPT ); Wed, 3 Aug 2011 14:07:56 -0400
Received: from relay2.suse.de (charybdis-ext.suse.de [195.135.221.2])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mx2.suse.de (Postfix) with ESMTP id B98148EE65;
	Wed, 3 Aug 2011 20:07:54 +0200 (CEST)
Received: by ds.suse.cz (Postfix, from userid 10065)
	id A578A747D2; Wed, 3 Aug 2011 20:07:53 +0200 (CEST)
From: David Sterba
To: linux-btrfs@vger.kernel.org
Cc: chris.mason@oracle.com, josef@redhat.com, miaox@cn.fujitsu.com,
	David Sterba
Subject: [RFC, crash][PATCH] btrfs: allow cross-subvolume file clone
Date: Wed, 3 Aug 2011 20:07:42 +0200
Message-Id: <1312394862-28143-1-git-send-email-dsterba@suse.cz>
X-Mailer: git-send-email 1.7.6.233.gd79bc
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org
X-Greylist: IP, sender and recipient auto-whitelisted, not delayed by
	milter-greylist-4.2.6 (demeter1.kernel.org [140.211.167.41]);
	Wed, 03 Aug 2011 18:08:00 +0000 (UTC)

Hi,

I'm working on a patch to allow cross-subvolume cloning. It works for
simple cases like cloning a single file.
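For readers who want to try the single-file case, here is a minimal sketch
of a userspace clone call. It is not the clone-file program seen in the
trace; the helper name clone_file() is hypothetical, and the ioctl number
is defined by hand on the assumption that the btrfs header is not
installed (0x94 is the btrfs ioctl magic, 9 the clone command).

```c
/* Sketch only: clone one file into another with the btrfs clone ioctl. */
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Assumed fallback definition; matches BTRFS_IOC_CLONE in the btrfs headers. */
#ifndef BTRFS_IOC_CLONE
#define BTRFS_IOC_CLONE _IOW(0x94, 9, int)
#endif

/* clone_file() is a hypothetical helper name; returns 0 or -errno. */
int clone_file(const char *src_path, const char *dst_path)
{
	int src_fd, dst_fd, ret = 0;

	/* Open the source first so a missing source never creates dst. */
	src_fd = open(src_path, O_RDONLY);
	if (src_fd < 0)
		return -errno;

	dst_fd = open(dst_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (dst_fd < 0) {
		ret = -errno;
		close(src_fd);
		return ret;
	}

	/* The ioctl is issued on the destination; the argument is the source fd. */
	if (ioctl(dst_fd, BTRFS_IOC_CLONE, src_fd) < 0)
		ret = -errno;

	close(dst_fd);
	close(src_fd);
	return ret;
}
```

Without the patch below, a call such as clone_file("subvol1/a", "subvol2/a")
is expected to return -EXDEV when the two paths live in different
subvolumes; with the patch applied the EXDEV check only rejects files on
different superblocks.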
When I cloned a full linux-2.6 tree, there was an immediate BUG_ON (after
the third cloned file) in btrfs_delayed_update_inode with -ENOSPC:

[ 925.546266] ------------[ cut here ]------------
[ 925.549921] kernel BUG at fs/btrfs/delayed-inode.c:1693!
[ 925.549921] invalid opcode: 0000 [#1] SMP
[ 925.549921] CPU 0
[ 925.549921] Modules linked in: btrfs
[ 925.549921]
[ 925.549921] Pid: 31167, comm: clone-file Not tainted 3.0.0-default+ #98 Intel Corporation Santa Rosa platform/Matanzas
[ 925.549921] RIP: 0010:[] [] btrfs_delayed_update_inode+0x2e0/0x2f0 [btrfs]
[ 925.549921] RSP: 0018:ffff88004f229be8 EFLAGS: 00010286
[ 925.549921] RAX: 00000000ffffffe4 RBX: ffff880048392c70 RCX: 0000000000018000
[ 925.549921] RDX: 0000000000001b1a RSI: 0000000000000001 RDI: ffff88007a6f8420
[ 925.549921] RBP: ffff88004f229c28 R08: 0000000000000004 R09: 0000000000000000
[ 925.549921] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880048393bf8
[ 925.549921] R13: ffff880048392cb8 R14: ffff880050ff3540 R15: ffff880052940000
[ 925.549921] FS: 00007fbf18b23700(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
[ 925.549921] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 925.549921] CR2: 00007fbcc68ba000 CR3: 000000004b4a8000 CR4: 00000000000006f0
[ 925.549921] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 925.549921] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 925.549921] Process clone-file (pid: 31167, threadinfo ffff88004f228000, task ffff88004b4e5140)
[ 925.549921] Stack:
[ 925.549921] ffff880048f7ddc0 0000000000018000 ffff88004f229c38 ffff880048393bf8
[ 925.549921] ffff880050ff3540 ffff880048393bf8 ffff880051a900a0 ffff880052940000
[ 925.549921] ffff88004f229c78 ffffffffa0034633 ffff88004f229c58 ffffffffa005f08b
[ 925.549921] Call Trace:
[ 925.549921] [] btrfs_update_inode+0x53/0x160 [btrfs]
[ 925.549921] [] ? btrfs_tree_unlock+0x6b/0xa0 [btrfs]
[ 925.549921] [] btrfs_ioctl_clone+0xa0a/0xcc0 [btrfs]
[ 925.549921] [] ? __do_fault+0x4a1/0x590
[ 925.549921] [] ? lock_release_holdtime+0x3d/0x1c0
[ 925.549921] [] ? do_page_fault+0x2d0/0x580
[ 925.549921] [] btrfs_ioctl+0x2db/0xda0 [btrfs]
[ 925.549921] [] ? do_page_fault+0x2d0/0x580
[ 925.549921] [] ? debug_check_no_locks_freed+0x177/0x180
[ 925.549921] [] ? kmem_cache_free+0xb5/0x1b0
[ 925.549921] [] do_vfs_ioctl+0x98/0x570
[ 925.549921] [] ? fget_light+0x2fd/0x3c0
[ 925.549921] [] sys_ioctl+0x4f/0x80
[ 925.549921] [] system_call_fastpath+0x16/0x1b
[ 925.549921] Code: e8 06 00 00 8d 0c 49 48 89 ca 48 89 4d c8 e8 c8 0f fa ff 85 c0 48 8b 4d c8 75 10 48 89 4b 08 e9 8e fd ff ff 0f 1f 80 00 00 00 00 <0f> 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 53
[ 925.549921] RIP [] btrfs_delayed_update_inode+0x2e0/0x2f0 [btrfs]
[ 925.549921] RSP
[ 925.876182] ---[ end trace 8b4c2031e1394913 ]---

The patch has been applied on top of current linus, which contains patches
from both pull requests (ed8f37370d83).

The filesystem consists of 5 devices, 23G each, with about 100G of usable
space, created by mkfs.btrfs with defaults. The kernel tree takes about 6G:

$ btrfs fi df .
Data, RAID0: total=10.00GB, used=5.55GB
Data: total=8.00MB, used=0.00
System, RAID1: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=1.50GB, used=121.75MB
Metadata: total=8.00MB, used=0.00

$ df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda5       110G  5.8G   82G   7% /mnt/sda5

i.e. plenty of free space.

It's possible that I've omitted some important bits in the patch itself, or
that this exposes a bug in ENOSPC handling or in delayed-inode.

david
---
From: David Sterba

Lift the EXDEV condition and allow different root trees for files being
cloned, then pass source inode's root when searching for extents.
Signed-off-by: David Sterba
---
 fs/btrfs/ioctl.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 0b980af..58eb0ef 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2183,7 +2183,7 @@ static noinline long btrfs_ioctl_clone(struct file *file, unsigned long srcfd,
 		goto out_fput;
 
 	ret = -EXDEV;
-	if (src->i_sb != inode->i_sb || BTRFS_I(src)->root != root)
+	if (src->i_sb != inode->i_sb)
 		goto out_fput;
 
 	ret = -ENOMEM;
@@ -2247,13 +2247,14 @@ static noinline long btrfs_ioctl_clone(struct file *file, unsigned long srcfd,
 		 * note the key will change type as we walk through the
 		 * tree.
 		 */
-		ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
+		ret = btrfs_search_slot(NULL, BTRFS_I(src)->root, &key, path,
+				0, 0);
 		if (ret < 0)
 			goto out;
 
 		nritems = btrfs_header_nritems(path->nodes[0]);
 		if (path->slots[0] >= nritems) {
-			ret = btrfs_next_leaf(root, path);
+			ret = btrfs_next_leaf(BTRFS_I(src)->root, path);
 			if (ret < 0)
 				goto out;
 			if (ret > 0)