From patchwork Tue Jun 14 01:01:53 2011
X-Patchwork-Submitter: Josef Bacik
X-Patchwork-Id: 877182
Message-ID: <4DF6B301.90904@redhat.com>
Date: Mon, 13 Jun 2011 21:01:53 -0400
From: Josef Bacik
To: Jim Schutt
CC: linux-btrfs@vger.kernel.org, ceph-devel@vger.kernel.org
Subject: Re: stalls with latest btrfs merge into 3.0-rc2
References: <4DF67C06.4000307@sandia.gov>
In-Reply-To: <4DF67C06.4000307@sandia.gov>
X-Mailing-List: linux-btrfs@vger.kernel.org

On 06/13/2011 05:07 PM, Jim Schutt wrote:
> Hi,
>
> On a
> system under a heavy write load from multiple ceph OSDs,
> I'm running into the following hung tasks where btrfs is implicated.
> I'm running commit 3c25fa740e2 from Linus' tree merged with
> commit cb9b41c92fa from git://ceph.newdream.net/git/ceph-client.git.
>

Please try this patch and verify it fixes the problem.  If it does I'll
make it less crappy and send it along.  Thanks,

Josef

Tested-by: Jim Schutt

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 7a9f517..532139e 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1236,12 +1236,16 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
 			schedule_timeout(1);
 
 		finish_wait(&cur_trans->writer_wait, &wait);
-		spin_lock(&root->fs_info->trans_lock);
-		root->fs_info->trans_no_join = 1;
-		spin_unlock(&root->fs_info->trans_lock);
 	} while (atomic_read(&cur_trans->num_writers) > 1 ||
 		 (should_grow && cur_trans->num_joined != joined));
 
+	spin_lock(&root->fs_info->trans_lock);
+	root->fs_info->trans_no_join = 1;
+	spin_unlock(&root->fs_info->trans_lock);
+
+	while (atomic_read(&cur_trans->num_writers) > 1)
+		schedule_timeout(1);
+
 	ret = create_pending_snapshots(trans, root->fs_info);
 	BUG_ON(ret);