From patchwork Tue Aug 13 08:33:16 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xue jiufei X-Patchwork-Id: 2843509 Return-Path: X-Original-To: patchwork-ocfs2-devel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork2.web.kernel.org (Postfix) with ESMTP id E3F79BF546 for ; Tue, 13 Aug 2013 08:37:49 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 9680D20441 for ; Tue, 13 Aug 2013 08:37:48 +0000 (UTC) Received: from userp1040.oracle.com (userp1040.oracle.com [156.151.31.81]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1A68D20440 for ; Tue, 13 Aug 2013 08:37:47 +0000 (UTC) Received: from acsinet21.oracle.com (acsinet21.oracle.com [141.146.126.237]) by userp1040.oracle.com (Sentrion-MTA-4.3.1/Sentrion-MTA-4.3.1) with ESMTP id r7D8b7lv029370 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Tue, 13 Aug 2013 08:37:09 GMT Received: from oss.oracle.com (oss-external.oracle.com [137.254.96.51]) by acsinet21.oracle.com (8.14.4+Sun/8.14.4) with ESMTP id r7D8b2Dw000114 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 13 Aug 2013 08:37:02 GMT Received: from localhost ([127.0.0.1] helo=oss.oracle.com) by oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1V9A6A-0002Wo-Cb; Tue, 13 Aug 2013 01:37:02 -0700 Received: from acsinet22.oracle.com ([141.146.126.238]) by oss.oracle.com with esmtp (Exim 4.63) (envelope-from ) id 1V9A5b-0002WC-Nw for ocfs2-devel@oss.oracle.com; Tue, 13 Aug 2013 01:36:27 -0700 Received: from userp1020.oracle.com (userp1020.oracle.com [156.151.31.79]) by acsinet22.oracle.com (8.14.4+Sun/8.14.4) with ESMTP id r7D8aQZa018514 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Tue, 13 Aug 2013 08:36:27 GMT Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [119.145.14.64]) by userp1020.oracle.com (Sentrion-MTA-4.3.1/Sentrion-MTA-4.3.1) with ESMTP id r7D8aO1M028615 (version=TLSv1/SSLv3 cipher=DES-CBC3-SHA bits=168 verify=FAIL) for ; Tue, 13 Aug 2013 08:36:25 GMT Received: from 172.24.2.119 (EHLO szxeml209-edg.china.huawei.com) ([172.24.2.119]) by szxrg01-dlp.huawei.com (MOS 4.3.4-GA FastPath queued) with ESMTP id BGO31993; Tue, 13 Aug 2013 16:33:42 +0800 (CST) Received: from SZXEML422-HUB.china.huawei.com (10.82.67.161) by szxeml209-edg.china.huawei.com (172.24.2.184) with Microsoft SMTP Server (TLS) id 14.1.323.7; Tue, 13 Aug 2013 16:33:41 +0800 Received: from [127.0.0.1] (10.135.72.87) by szxeml422-hub.china.huawei.com (10.82.67.161) with Microsoft SMTP Server id 14.1.323.7; Tue, 13 Aug 2013 16:33:39 +0800 Message-ID: <5209EF4C.4040403@huawei.com> Date: Tue, 13 Aug 2013 16:33:16 +0800 From: Xue jiufei User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:17.0) Gecko/20130801 Thunderbird/17.0.8 MIME-Version: 1.0 To: Andrew Morton X-Originating-IP: [10.135.72.87] X-CFilter-Loop: Reflected X-Flow-Control-Info: class=Pass-to-MM reputation=ipRisk-All ip=119.145.14.64 ct-class=R3 ct-vol1=0 ct-vol2=6 ct-vol3=5 ct-risk=10 ct-spam1=0 ct-spam2=0 ct-bulk=88 rcpts=1 size=3190 X-Sendmail-CM-Score: 0.00% X-Sendmail-CM-Analysis: v=2.1 cv=P+gD2Ewu c=1 sm=1 tr=0 a=6YMIiCGU//PUEf62qScyBg==:117 a=6YMIiCGU//PUEf62qScyBg==:17 a=7dTNpnL_XZEA:10 a=_ucFIcivCngA:10 a=xIsUATygPekA:10 a=O9dq5j03pVQA:10 a=-6GgXGw8a1AA:10 a=8nJEP1OIZ-IA:10 a=i0EeH86SAAAA:8 a=GBvfMEGEchoA:10 a =WBRdC1RZeK9ueVFUBRsA:9 a=wPNLvfGTeEIA:10 a=hPjdaMEvmhQA:10 X-Sendmail-CT-Classification: not spam X-Sendmail-CT-RefID: str=0001.0A090202.5209F00A.00A0:SCFSTAT1612107, ss=1, re=-4.000, recu=0.000, reip=0.000, cl=1, cld=1, fgs=0 Cc: Mark Fasheh , ocfs2-devel@oss.oracle.com Subject: [Ocfs2-devel] [PATCH] ocfs2/dlm: fix possible convertion deadlock X-BeenThere: ocfs2-devel@oss.oracle.com X-Mailman-Version: 2.1.9 Precedence: list Reply-To: xuejiufei@huawei.com List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: ocfs2-devel-bounces@oss.oracle.com Errors-To: ocfs2-devel-bounces@oss.oracle.com X-Source-IP: acsinet21.oracle.com [141.146.126.237] X-Spam-Status: No, score=-7.0 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP We found there is a convertion deadlock when the owner of lockres happened to crash before send DLM_PROXY_AST_MSG for a downconverting lock. The situation is as follows: Node1 Node2 Node3 the owner of lockresA lock_1 granted at EX mode and call ocfs2_cluster_unlock to decrease ex_holders. converting lock_3 from NL to EX send DLM_PROXY_AST_MSG to Node1, asking Node 1 to downconvert. receiving DLM_PROXY_AST_MSG, thread ocfs2dc send DLM_CONVERT_LOCK_MSG to Node2 to downconvert lock_1(EX->NL). lock_1 can be granted and put it into pending_asts list, return DLM_NORMAL. then something happened and Node2 crashed. received DLM_NORMAL, waiting for DLM_PROXY_AST_MSG. selected as the recovery master, receving migrate lock from Node1, queue lock_1 to the tail of converting list. After dlm recovery, converting list in the master of lockresA(Node3) will be: converting list head <-> lock_3(NL->EX) <->lock_1(EX<->NL). Requested mode of lock_3 is not compatible with the granted mode of lock_1, so it can not be granted. and lock_1 can not downconvert because covnerting queue is strictly FIFO. So a deadlock is created. We think function dlm_process_recovery_data() should queue_ast for lock_1 or alter the order of lock_1 and lock_3, so dlm_thread can process lock_1 first. And if there are multiple downconverting locks, they must convert form PR to NL, so no need to sort them. Signed-off-by: joyce --- fs/ocfs2/dlm/dlmrecovery.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c index 773bd32..6cdbbe7 100644 --- a/fs/ocfs2/dlm/dlmrecovery.c +++ b/fs/ocfs2/dlm/dlmrecovery.c @@ -1963,7 +1963,17 @@ skip_lvb: } if (!bad) { dlm_lock_get(newlock); - list_add_tail(&newlock->list, queue); + if (mres->flags & DLM_MRES_RECOVERY && + ml->list == DLM_CONVERTING_LIST && + newlock->ml.type > newlock->ml.convert_type) { + mlog(0, "%s:%.*s: newlock %u:%llu is doing downconvert, " + "add it to the head of converting list\n", + dlm->name, res->lockname.len, res->lockname.name, + dlm_get_lock_cookie_node(be64_to_cpu(ml->cookie)), + dlm_get_lock_cookie_seq(be64_to_cpu(ml->cookie))); + list_add(&newlock->list, queue); + } else + list_add_tail(&newlock->list, queue); mlog(0, "%s:%.*s: added lock for node %u, " "setting refmap bit\n", dlm->name, res->lockname.len, res->lockname.name, ml->node);