From patchwork Mon Jun 9 20:04:00 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 4323371
To: ocfs2-devel@oss.oracle.com
From: akpm@linux-foundation.org
Date: Mon, 09 Jun 2014 13:04:00 -0700
Message-Id: <20140609200401.3D4AB5A4A6F@corp2gmr1-2.hot.corp.google.com>
Cc: mfasheh@suse.com
Subject: [Ocfs2-devel] [patch 2/8] ocfs2: fix deadlock when two nodes are
 converting same lock from PR to EX and idletimeout closes conn

From: Tariq Saeed
Subject: ocfs2: fix deadlock when two nodes are converting same lock from PR to EX and idletimeout closes conn

Orabug: 18639535

In a two-node cluster, both nodes hold a lock at PR level and both want
to convert to EX at the same time.  Master node 1 has sent a BAST and
then closes the connection due to idle timeout.  Node 0 receives the
BAST and sends an unlock request with the cancel flag, but gets error
-ENOTCONN.  The problem is that this error is ignored in
dlm_send_remote_unlock_request() on the **incorrect** assumption that
the master is dead.  See the NOTE in the comment there for why it
returns DLM_NORMAL.
Upon getting DLM_NORMAL, node 0 proceeds to send the convert request
(without the cancel flag), which fails with -ENOTCONN.  It waits 5
seconds and resends.

This time it gets DLM_IVLOCKID from the master, because the lock is not
found on the granted queue: it had been moved to the converting queue in
response to the PR->EX convert request.  No way out.

Node 1 (master)                         Node 0
===============                         ======

lock mode PR                            PR

convert PR -> EX
mv grant -> convert and queue BAST
...
                     <-------- convert PR -> EX
convert queue looks like this: ((node 1, PR -> EX) (node 0, PR -> EX))
...
                        BAST (want PR -> NL)
                     ------------------>
...
idle timeout, conn closed
                                ...
                                In response to BAST,
                                sends unlock with cancel convert flag
                                gets -ENOTCONN. Ignores and
                                sends remote convert request
                                gets -ENOTCONN, waits 5 sec, retries
...
reconnects
                   <----------------- convert req goes through on next try
does not find lock on grant queue
                   status DLM_IVLOCKID
                   ------------------>
...

No way out.  The fix is to keep retrying the unlock with the cancel flag
until it succeeds or the master dies.

Signed-off-by: Tariq Saeed
Reviewed-by: Mark Fasheh
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
---

 fs/ocfs2/dlm/dlmunlock.c |   18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff -puN fs/ocfs2/dlm/dlmunlock.c~deadlock-when-two-nodes-are-converting-same-lock-from-pr-to-ex-and-idletimeout-closes-conn fs/ocfs2/dlm/dlmunlock.c
--- a/fs/ocfs2/dlm/dlmunlock.c~deadlock-when-two-nodes-are-converting-same-lock-from-pr-to-ex-and-idletimeout-closes-conn
+++ a/fs/ocfs2/dlm/dlmunlock.c
@@ -191,7 +191,9 @@ static enum dlm_status dlmunlock_common(
 					 DLM_UNLOCK_CLEAR_CONVERT_TYPE);
 	} else if (status == DLM_RECOVERING ||
 		   status == DLM_MIGRATING ||
-		   status == DLM_FORWARD) {
+		   status == DLM_FORWARD ||
+		   status == DLM_NOLOCKMGR
+		   ) {
 		/* must clear the actions because this unlock
 		 * is about to be retried.  cannot free or do
 		 * any list manipulation. */
@@ -200,7 +202,8 @@ static enum dlm_status dlmunlock_common(
 		     res->lockname.name,
 		     status==DLM_RECOVERING?"recovering":
 		     (status==DLM_MIGRATING?"migrating":
-		      "forward"));
+				(status == DLM_FORWARD ? "forward" :
+						"nolockmanager")));
 		actions = 0;
 	}
 	if (flags & LKM_CANCEL)
@@ -364,7 +367,10 @@ static enum dlm_status dlm_send_remote_u
 			 * updated state to the recovery master.  this thread
 			 * just needs to finish out the operation and call
 			 * the unlockast. */
-			ret = DLM_NORMAL;
+			if (dlm_is_node_dead(dlm, owner))
+				ret = DLM_NORMAL;
+			else
+				ret = DLM_NOLOCKMGR;
 		} else {
 			/* something bad.  this will BUG in ocfs2 */
 			ret = dlm_err_to_dlm_status(tmpret);
@@ -638,7 +644,9 @@ retry:
 
 	if (status == DLM_RECOVERING ||
 	    status == DLM_MIGRATING ||
-	    status == DLM_FORWARD) {
+	    status == DLM_FORWARD ||
+	    status == DLM_NOLOCKMGR) {
+
 		/* We want to go away for a tiny bit to allow recovery
 		 * / migration to complete on this resource. I don't
 		 * know of any wait queue we could sleep on as this
@@ -650,7 +658,7 @@ retry:
 		msleep(50);
 
 		mlog(0, "retrying unlock due to pending recovery/"
-		     "migration/in-progress\n");
+		     "migration/in-progress/reconnect\n");
 		goto retry;
 	}
 
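
Illustration only, not part of the patch: the retry behaviour described
above can be modelled in user space roughly as follows.  This is a
minimal sketch under stated assumptions.  Every sketch_* name is
hypothetical and stands in for the kernel code; the "link" is a counter
rather than a real connection; and the 50 ms sleep between attempts is
left out.  It only shows the decision the patch introduces: a failed
send counts as success when the master is dead, and otherwise yields a
retryable status.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for DLM_NORMAL and DLM_NOLOCKMGR. */
enum sketch_status { SK_NORMAL, SK_NOLOCKMGR };

/* Toy master node: the link fails for the next `down_for` sends. */
struct sketch_master {
	int down_for;
	bool dead;
};

/* Pretend network send: -ENOTCONN while the link is down, else 0. */
static int sketch_send_unlock(struct sketch_master *m)
{
	if (m->down_for > 0) {
		m->down_for--;
		return -ENOTCONN;
	}
	return 0;
}

/*
 * Models the patched decision: on a send error, report success only
 * if the master is really dead (recovery will clean up); otherwise
 * report a retryable "no lock manager" status.
 */
static enum sketch_status sketch_remote_unlock(struct sketch_master *m)
{
	int err = sketch_send_unlock(m);

	if (err == 0)
		return SK_NORMAL;
	return m->dead ? SK_NORMAL : SK_NOLOCKMGR;
}

/*
 * Models the caller's retry loop: keep resending the unlock while the
 * status is retryable.  Returns the number of attempts made.
 */
static int sketch_unlock_with_retry(struct sketch_master *m)
{
	enum sketch_status st;
	int attempts = 0;

	assert(m != NULL);
	do {
		st = sketch_remote_unlock(m);
		attempts++;
		/* the kernel code sleeps 50 ms before retrying; omitted */
	} while (st == SK_NOLOCKMGR);

	return attempts;
}
```

With a live master whose link is down for three sends, the loop makes
four attempts and the unlock finally reaches the master.  With a dead
master it stops after the first failed send, which matches the
pre-patch behaviour that is still wanted in that case.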