From patchwork Tue Dec 18 13:52:42 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Josef Bacik
X-Patchwork-Id: 1892031
Return-Path:
X-Original-To: patchwork-linux-btrfs@patchwork.kernel.org
Delivered-To: patchwork-process-083081@patchwork1.kernel.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by patchwork1.kernel.org (Postfix) with ESMTP id 32AF73FCD4
	for ; Tue, 18 Dec 2012 13:52:51 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754563Ab2LRNws (ORCPT ); Tue, 18 Dec 2012 08:52:48 -0500
Received: from mx1.fusionio.com ([66.114.96.30]:39819 "EHLO mx1.fusionio.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1754352Ab2LRNwr (ORCPT ); Tue, 18 Dec 2012 08:52:47 -0500
X-ASG-Debug-ID: 1355838765-03d6a5317e1da50001-6jHSXT
Received: from mail1.int.fusionio.com (mail1.int.fusionio.com [10.101.1.21])
	by mx1.fusionio.com with ESMTP id k6AvAQqESXqAmpGt
	(version=TLSv1 cipher=AES128-SHA bits=128 verify=NO);
	Tue, 18 Dec 2012 06:52:45 -0700 (MST)
X-Barracuda-Envelope-From: JBacik@fusionio.com
Received: from localhost (24.211.209.217) by mail.fusionio.com (10.101.1.19)
	with Microsoft SMTP Server (TLS) id 8.3.83.0;
	Tue, 18 Dec 2012 06:52:45 -0700
Date: Tue, 18 Dec 2012 08:52:42 -0500
From: Josef Bacik
To: Liu Bo
CC: "linux-btrfs@vger.kernel.org" , Jim Schutt
Subject: Re: [PATCH] Btrfs: fix a deadlock on chunk mutex
Message-ID: <20121218135242.GC2403@localhost.localdomain>
X-ASG-Orig-Subj: Re: [PATCH] Btrfs: fix a deadlock on chunk mutex
References: <1355363557-2962-1-git-send-email-bo.li.liu@oracle.com>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <1355363557-2962-1-git-send-email-bo.li.liu@oracle.com>
User-Agent: Mutt/1.5.21 (2011-07-01)
X-Barracuda-Connect: mail1.int.fusionio.com[10.101.1.21]
X-Barracuda-Start-Time: 1355838765
X-Barracuda-Encrypted: AES128-SHA
X-Barracuda-URL: http://10.101.1.180:8000/cgi-mod/mark.cgi
X-Virus-Scanned: by bsmtpd at fusionio.com
X-Barracuda-Bayes: INNOCENT GLOBAL 0.0000 1.0000 -2.0210
X-Barracuda-Spam-Score: -2.02
X-Barracuda-Spam-Status: No, SCORE=-2.02 using per-user scores of TAG_LEVEL=1000.0
	QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests=
X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.2.117363
	Rule breakdown below
	 pts rule name              description
	---- ---------------------- --------------------------------------------------
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

On Wed, Dec 12, 2012 at 06:52:37PM -0700, Liu Bo wrote:
> An user reported that he has hit an annoying deadlock while playing with
> ceph based on btrfs.
>
> Current updating device tree requires space from METADATA chunk,
> so we -may- need to do a recursive chunk allocation when adding/updating
> dev extent, that is where the deadlock comes from.
>
> If we use SYSTEM metadata to update device tree, we can avoid the recursive
> stuff.
>

This is going to cause us to allocate many more system chunks than we used to,
which could land us in trouble.  Instead let's just keep us from re-entering if
we're already allocating a chunk.  We do the chunk allocation when we don't have
enough space for a cluster, but we'll likely have plenty of space to make an
allocation.  Can you give this patch a try, Jim, and see if it fixes your problem?
Thanks,

Josef
---
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index e152809..59df5e7 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3564,6 +3564,10 @@ static int do_chunk_alloc(struct btrfs_trans_handle *trans,
 	int wait_for_alloc = 0;
 	int ret = 0;
 
+	/* Don't re-enter if we're already allocating a chunk */
+	if (trans->allocating_chunk)
+		return -ENOSPC;
+
 	space_info = __find_space_info(extent_root->fs_info, flags);
 	if (!space_info) {
 		ret = update_space_info(extent_root->fs_info, flags,
@@ -3606,6 +3610,8 @@ again:
 		goto again;
 	}
 
+	trans->allocating_chunk = true;
+
 	/*
 	 * If we have mixed data/metadata chunks we want to make sure we keep
 	 * allocating mixed chunks instead of individual chunks.
@@ -3632,6 +3638,7 @@ again:
 	check_system_chunk(trans, extent_root, flags);
 
 	ret = btrfs_alloc_chunk(trans, extent_root, flags);
+	trans->allocating_chunk = false;
 	if (ret < 0 && ret != -ENOSPC)
 		goto out;
 
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index e6509b9..47ad8be 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -388,6 +388,7 @@ again:
 	h->qgroup_reserved = qgroup_reserved;
 	h->delayed_ref_elem.seq = 0;
 	h->type = type;
+	h->allocating_chunk = false;
 	INIT_LIST_HEAD(&h->qgroup_ref_list);
 	INIT_LIST_HEAD(&h->new_bgs);
 
diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
index 0e8aa1e..69700f7 100644
--- a/fs/btrfs/transaction.h
+++ b/fs/btrfs/transaction.h
@@ -68,6 +68,7 @@ struct btrfs_trans_handle {
 	struct btrfs_block_rsv *orig_rsv;
 	short aborted;
 	short adding_csums;
+	bool allocating_chunk;
 	enum btrfs_trans_type type;
 	/*
 	 * this root is only needed to validate that the root passed to