From patchwork Tue Feb 12 22:10:40 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Josef Bacik
X-Patchwork-Id: 2132411
Date: Tue, 12 Feb 2013 17:10:40 -0500
From: Josef Bacik
Subject: [RFC] fix async ordered operations flush deadlock
Message-ID: <20130212221040.GA1247@localhost.localdomain>
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2011-07-01)
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

Hello,

So btrfs_commit_transaction does this

	ret = btrfs_run_ordered_operations(root, 0);

which async flushes all inodes on the ordered operations list.  The problem
with this is that we wait for this flushing to finish, so we end up with this

Task 1                          Task 2                  Task 3
start transaction
                                set trans_no_join
                                wait forever
commit
 btrfs_run_ordered_operations
                                                        async flush inode
                                                         cow_file_range
                                                          join_transaction
                                                           wait forever
  wait forever

Task 1 is waiting for the flushing to finish, task 2 is waiting for task 1 to
give up its num_writers, and task 3 is waiting to join the transaction.  This
used to work fine because the flushing was done inline, so we just took on the
current journal info of the guy who managed to race in and get a ref on the
transaction, but now we've gotten rid of that by doing it async.

Here is a basic bullshit patch that just moves the flushing below the "is
somebody else committing right now?" logic, which will hopefully fix the
problem.  It's a shit patch, but it's 5:10 and I need to go make Liam dinner.
I'll try to think of a better solution between now and tomorrow, but I'm open
to suggestions.  Thanks,

Josef
---
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 49c79b3..8c50495 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1480,13 +1480,6 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
 	int should_grow = 0;
 	unsigned long now = get_seconds();
 
-	ret = btrfs_run_ordered_operations(root, 0);
-	if (ret) {
-		btrfs_abort_transaction(trans, root, ret);
-		btrfs_end_transaction(trans, root);
-		return ret;
-	}
-
 	/* Stop the commit early if ->aborted is set */
 	if (unlikely(ACCESS_ONCE(cur_trans->aborted))) {
 		ret = cur_trans->aborted;
@@ -1541,6 +1534,10 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
 	spin_unlock(&cur_trans->commit_lock);
 	wake_up(&root->fs_info->transaction_blocked_wait);
 
+	ret = btrfs_run_ordered_operations(root, 0);
+	if (ret)
+		goto cleanup_transaction;
+
 	spin_lock(&root->fs_info->trans_lock);
 	if (cur_trans->list.prev != &root->fs_info->trans_list) {
 		prev_trans = list_entry(cur_trans->list.prev,
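
The three-way wait described in the message can be pictured as a wait-for
graph: task 1 waits on the async flush (task 3), task 3 waits on the committer
that blocked joins (task 2), and task 2 waits on task 1 to drop its
num_writers.  Below is a minimal userspace sketch of that picture.  It is not
btrfs code; `struct wait_graph`, `add_wait`, `has_deadlock` and the two
`graph_*` builders are hypothetical names made up for illustration.  The
"after" graph is only an assumption about the intent of the patch, namely that
task 1 notices a commit in progress before it ever waits on the flush.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the three tasks in the message, not kernel code. */
enum { TASK1, TASK2, TASK3, NR_TASKS };

struct wait_graph {
	/* waits_on[a][b]: task a cannot make progress until task b does */
	bool waits_on[NR_TASKS][NR_TASKS];
};

void add_wait(struct wait_graph *g, int waiter, int holder)
{
	assert(waiter >= 0 && waiter < NR_TASKS);
	assert(holder >= 0 && holder < NR_TASKS);
	g->waits_on[waiter][holder] = true;
}

/* Depth-first search: reaching a task that is still on the stack is a cycle. */
bool visit(const struct wait_graph *g, int t, int *state)
{
	state[t] = 1;			/* 1 = on the DFS stack */
	for (int n = 0; n < NR_TASKS; n++) {
		if (!g->waits_on[t][n])
			continue;
		if (state[n] == 1)
			return true;
		if (state[n] == 0 && visit(g, n, state))
			return true;
	}
	state[t] = 2;			/* 2 = fully explored, no cycle through t */
	return false;
}

bool has_deadlock(const struct wait_graph *g)
{
	int state[NR_TASKS] = {0};	/* 0 = not visited yet */

	for (int t = 0; t < NR_TASKS; t++)
		if (state[t] == 0 && visit(g, t, state))
			return true;
	return false;
}

/* Ordering described in the message: flush first, then check for a committer. */
struct wait_graph graph_before_patch(void)
{
	struct wait_graph g = {0};

	add_wait(&g, TASK1, TASK3);	/* waits for the async flush to finish */
	add_wait(&g, TASK2, TASK1);	/* waits for task 1 to drop num_writers */
	add_wait(&g, TASK3, TASK2);	/* join blocked by trans_no_join */
	return g;
}

/* Assumed ordering with the patch: task 1 sees the committer and backs off. */
struct wait_graph graph_after_patch(void)
{
	struct wait_graph g = {0};

	add_wait(&g, TASK1, TASK2);	/* waits for the other commit to finish */
	add_wait(&g, TASK3, TASK2);	/* a late joiner still waits on the commit */
	return g;
}
```

Calling `has_deadlock()` on the result of `graph_before_patch()` finds the
cycle 1 -> 3 -> 2 -> 1, while the graph from `graph_after_patch()` has every
edge pointing at task 2, which waits on nobody.  The model says nothing about
whether the real code has other waiters, so it only illustrates the cycle, it
does not prove the patch correct.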