From patchwork Thu Apr 2 17:25:11 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 6150721 Return-Path: X-Original-To: patchwork-linux-btrfs@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 2023CBF4A6 for ; Thu, 2 Apr 2015 17:26:12 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 0C9F0203AE for ; Thu, 2 Apr 2015 17:26:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id CA51E203AD for ; Thu, 2 Apr 2015 17:26:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753146AbbDBR0F (ORCPT ); Thu, 2 Apr 2015 13:26:05 -0400 Received: from victor.provo.novell.com ([137.65.250.26]:33320 "EHLO prv3-mh.provo.novell.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1752704AbbDBR0E (ORCPT ); Thu, 2 Apr 2015 13:26:04 -0400 Received: from debian3.lan (prv-ext-foundry1int.gns.novell.com [137.65.251.240]) by prv3-mh.provo.novell.com with ESMTP (NOT encrypted); Thu, 02 Apr 2015 11:25:48 -0600 From: Filipe Manana To: linux-btrfs@vger.kernel.org Cc: Filipe Manana Subject: [PATCH v2] Btrfs: fix range cloning when same inode used as source and destination Date: Thu, 2 Apr 2015 18:25:11 +0100 Message-Id: <1427995511-13028-1-git-send-email-fdmanana@suse.com> X-Mailer: git-send-email 2.1.3 In-Reply-To: <1427810206-23103-1-git-send-email-fdmanana@suse.com> References: <1427810206-23103-1-git-send-email-fdmanana@suse.com> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP While searching for extents to clone we might find one where we only use a part of it coming from its tail. If our destination inode is the same the source inode, we end up removing the tail part of the extent item and insert after a new one that point to the same extent with an adjusted key file offset and data offset. After this we search for the next extent item in the fs/subvol tree with a key that has an offset incremented by one. But this second search leaves us at the new extent item we inserted previously, and since that extent item has a non-zero data offset, it it can make us call btrfs_drop_extents with an empty range (start == end) which causes the following warning: [23978.537119] WARNING: CPU: 6 PID: 16251 at fs/btrfs/file.c:550 btrfs_drop_extent_cache+0x43/0x385 [btrfs]() (...) [23978.557266] Call Trace: [23978.557978] [] dump_stack+0x4c/0x65 [23978.559191] [] warn_slowpath_common+0xa1/0xbb [23978.560699] [] ? btrfs_drop_extent_cache+0x43/0x385 [btrfs] [23978.562389] [] warn_slowpath_null+0x1a/0x1c [23978.563613] [] btrfs_drop_extent_cache+0x43/0x385 [btrfs] [23978.565103] [] ? time_hardirqs_off+0x15/0x28 [23978.566294] [] ? trace_hardirqs_off+0xd/0xf [23978.567438] [] __btrfs_drop_extents+0x6b/0x9e1 [btrfs] [23978.568702] [] ? trace_hardirqs_on+0xd/0xf [23978.569763] [] ? ____cache_alloc+0x69/0x2eb [23978.570817] [] ? virt_to_head_page+0x9/0x36 [23978.571872] [] ? cache_alloc_debugcheck_after.isra.42+0x16c/0x1cb [23978.573466] [] ? kmemleak_alloc_recursive.constprop.52+0x16/0x18 [23978.574962] [] btrfs_drop_extents+0x66/0x7f [btrfs] [23978.576179] [] btrfs_clone+0x516/0xaf5 [btrfs] [23978.577311] [] ? lock_extent_range+0x7b/0xcd [btrfs] [23978.578520] [] btrfs_ioctl_clone+0x28e/0x39f [btrfs] [23978.580282] [] btrfs_ioctl+0xb51/0x219a [btrfs] (...) [23978.591887] ---[ end trace 988ec2a653d03ed3 ]--- Then we attempt to insert a new extent item with a key that already exists, which makes btrfs_insert_empty_item return -EEXIST resulting in abortion of the current transaction: [23978.594355] WARNING: CPU: 6 PID: 16251 at fs/btrfs/super.c:260 __btrfs_abort_transaction+0x52/0x114 [btrfs]() (...) [23978.622589] Call Trace: [23978.623181] [] dump_stack+0x4c/0x65 [23978.624359] [] warn_slowpath_common+0xa1/0xbb [23978.625573] [] ? __btrfs_abort_transaction+0x52/0x114 [btrfs] [23978.626971] [] warn_slowpath_fmt+0x46/0x48 [23978.628003] [] ? vprintk_default+0x1d/0x1f [23978.629138] [] __btrfs_abort_transaction+0x52/0x114 [btrfs] [23978.630528] [] btrfs_clone+0x7fc/0xaf5 [btrfs] [23978.631635] [] ? lock_extent_range+0x7b/0xcd [btrfs] [23978.632886] [] btrfs_ioctl_clone+0x28e/0x39f [btrfs] [23978.634119] [] btrfs_ioctl+0xb51/0x219a [btrfs] (...) [23978.647714] ---[ end trace 988ec2a653d03ed4 ]--- This is wrong because we should not process the extent item that we just inserted previously, and instead process the extent item that follows it in the tree For example for the test case I wrote for fstests: bs=$((64 * 1024)) mkfs.btrfs -f -l $bs -O ^no-holes /dev/sdc mount /dev/sdc /mnt xfs_io -f -c "pwrite -S 0xaa $(($bs * 2)) $(($bs * 2))" /mnt/foo $CLONER_PROG -s $((3 * $bs)) -d $((267 * $bs)) -l 0 /mnt/foo /mnt/foo $CLONER_PROG -s $((217 * $bs)) -d $((95 * $bs)) -l 0 /mnt/foo /mnt/foo The second clone call fails with -EEXIST, because when we process the first extent item (offset 262144), we drop part of it (counting from the end) and then insert a new extent item with a key greater then the key we found. The next time we search the tree we search for a key with offset 262144 + 1, which leaves us at the new extent item we have just inserted but we think it refers to an extent that we need to clone. Fix this by ensuring the next search key uses an offset corresponding to the offset of the key we found previously plus the data length of the corresponding extent item. This ensures we skip new extent items that we inserted and works for the case of implicit holes too (NO_HOLES feature). A test case for fstests follows soon. Signed-off-by: Filipe Manana --- V2: Fixed a warning about potentially uninitialized variable. David got this warning on a 4.5.1 gcc, but I didn't on a 4.9.2 gcc however. fs/btrfs/ioctl.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 09a566a..029f4da 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -3254,6 +3254,7 @@ process_slot: u64 datao = 0, datal = 0; u8 comp; u64 drop_start; + u64 next_key_min_offset; extent = btrfs_item_ptr(leaf, slot, struct btrfs_file_extent_item); @@ -3285,7 +3286,7 @@ process_slot: } else if (key.offset >= off + len) { break; } - + next_key_min_offset = key.offset + datal; size = btrfs_item_size_nr(leaf, slot); read_extent_buffer(leaf, buf, btrfs_item_ptr_offset(leaf, slot), @@ -3498,9 +3499,11 @@ process_slot: goto out; if (new_key.offset + datal >= destoff + len) break; + key.offset = next_key_min_offset; + } else { + key.offset++; } btrfs_release_path(path); - key.offset++; } ret = 0;