From patchwork Tue Apr 2 00:46:04 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Josef Bacik X-Patchwork-Id: 2373541 Return-Path: X-Original-To: patchwork-linux-btrfs@patchwork.kernel.org Delivered-To: patchwork-process-083081@patchwork2.kernel.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by patchwork2.kernel.org (Postfix) with ESMTP id AE1A4DFB7B for ; Tue, 2 Apr 2013 00:46:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757906Ab3DBAqI (ORCPT ); Mon, 1 Apr 2013 20:46:08 -0400 Received: from dkim2.fusionio.com ([66.114.96.54]:33070 "EHLO dkim2.fusionio.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757848Ab3DBAqH (ORCPT ); Mon, 1 Apr 2013 20:46:07 -0400 Received: from mx2.fusionio.com (unknown [10.101.1.160]) by dkim2.fusionio.com (Postfix) with ESMTP id DAF739A0694 for ; Mon, 1 Apr 2013 18:46:06 -0600 (MDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=fusionio.com; s=default; t=1364863566; bh=6ls4pOe1p+F/PE3641wYomRxh5GdnW2ssoLiESJnL6U=; h=From:To:Subject:Date; b=SFLraEqvuzOvA7PJ4WrSIp9khyfBvBen2c3gSW0ArR+4z2Iq/2VAlYYaDj1RKaChQ 9WDz+itkyB5xqeh87vLv5xN7PnACMGqppNBDz/0jfIsWAa0i+f5Ez5xM+7lbyYwHUh l9C/J8JEirbNiANXIiNbYqPK8i8hD55rgRXppzWc= X-ASG-Debug-ID: 1364863566-0421b533523ff30001-6jHSXT Received: from mail1.int.fusionio.com (mail1.int.fusionio.com [10.101.1.21]) by mx2.fusionio.com with ESMTP id Bg8ER0jagzfcWvGO (version=TLSv1 cipher=AES128-SHA bits=128 verify=NO) for ; Mon, 01 Apr 2013 18:46:06 -0600 (MDT) X-Barracuda-Envelope-From: JBacik@fusionio.com Received: from localhost (98.26.82.158) by mail.fusionio.com (10.101.1.19) with Microsoft SMTP Server (TLS) id 8.3.83.0; Mon, 1 Apr 2013 18:46:05 -0600 From: Josef Bacik To: Subject: [PATCH] Btrfs: compare relevant parts of delayed tree refs Date: Mon, 1 Apr 2013 20:46:04 -0400 X-ASG-Orig-Subj: [PATCH] Btrfs: compare relevant parts of delayed tree refs Message-ID: <1364863564-3742-1-git-send-email-jbacik@fusionio.com> X-Mailer: git-send-email 1.7.7.6 MIME-Version: 1.0 X-Barracuda-Connect: mail1.int.fusionio.com[10.101.1.21] X-Barracuda-Start-Time: 1364863566 X-Barracuda-Encrypted: AES128-SHA X-Barracuda-URL: http://10.101.1.181:8000/cgi-mod/mark.cgi X-Virus-Scanned: by bsmtpd at fusionio.com X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using global scores of TAG_LEVEL=1000.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.126914 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org A user reported a panic while running a balance. What was happening was he was relocating a block, which added the reference to the relocation tree. Then relocation would walk through the relocation tree and drop that reference and free that block, and then it would walk down a snapshot which referenced the same block and add another ref to the block. The problem is this was all happening in the same transaction, so the parent block was free'ed up when we drop our reference which was immediately available for allocation, and then it was used _again_ to add a reference for the same block from a different snapshot. This resulted in something like this in the delayed ref tree add ref to 90234880, parent=2067398656, ref_root 1766, level 1 del ref to 90234880, parent=2067398656, ref_root 18446744073709551608, level 1 add ref to 90234880, parent=2067398656, ref_root 1767, level 1 as you can see the ref_root's don't match, because when we inc the ref we use the header owner, which is the original tree the block belonged to, instead of the data reloc tree. Then when we remove the extent we use the reloc tree objectid. But none of this matters, since it is a shared reference which means only the parent matters. When the delayed ref stuff runs it adds all the increments first, and then does all the drops, to make sure that we don't delete the ref if we net a positive ref count. But tree blocks aren't allowed to have multiple refs from the same block, so this panics when it tries to add the second ref. We need the add and the drop to cancel each other out in memory so we only do the final add. So to fix this we need to adjust how the delayed refs are added to the tree. Only the ref_root matters when it is a normal backref, and only the parent matters when it is a shared backref. So make our decision based on what ref type we have. This allows us to keep the ref_root in memory in case anybody wants to use it for something else, and it allows the delayed refs to be merged properly so we don't end up with this panic. With this patch the users image no longer panics on mount, and it has a clean fsck after a normal mount/umount cycle. Thanks, Cc: stable@vger.kernel.org Reported-by: Roman Mamedov Signed-off-by: Josef Bacik --- fs/btrfs/delayed-ref.c | 24 ++++++++++++++---------- 1 files changed, 14 insertions(+), 10 deletions(-) diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c index b7a0641..116abec 100644 --- a/fs/btrfs/delayed-ref.c +++ b/fs/btrfs/delayed-ref.c @@ -40,16 +40,19 @@ struct kmem_cache *btrfs_delayed_extent_op_cachep; * compare two delayed tree backrefs with same bytenr and type */ static int comp_tree_refs(struct btrfs_delayed_tree_ref *ref2, - struct btrfs_delayed_tree_ref *ref1) + struct btrfs_delayed_tree_ref *ref1, int type) { - if (ref1->root < ref2->root) - return -1; - if (ref1->root > ref2->root) - return 1; - if (ref1->parent < ref2->parent) - return -1; - if (ref1->parent > ref2->parent) - return 1; + if (type == BTRFS_TREE_BLOCK_REF_KEY) { + if (ref1->root < ref2->root) + return -1; + if (ref1->root > ref2->root) + return 1; + } else { + if (ref1->parent < ref2->parent) + return -1; + if (ref1->parent > ref2->parent) + return 1; + } return 0; } @@ -113,7 +116,8 @@ static int comp_entry(struct btrfs_delayed_ref_node *ref2, if (ref1->type == BTRFS_TREE_BLOCK_REF_KEY || ref1->type == BTRFS_SHARED_BLOCK_REF_KEY) { return comp_tree_refs(btrfs_delayed_node_to_tree_ref(ref2), - btrfs_delayed_node_to_tree_ref(ref1)); + btrfs_delayed_node_to_tree_ref(ref1), + ref1->type); } else if (ref1->type == BTRFS_EXTENT_DATA_REF_KEY || ref1->type == BTRFS_SHARED_DATA_REF_KEY) { return comp_data_refs(btrfs_delayed_node_to_data_ref(ref2),