From patchwork Thu Dec 3 08:50:03 2015
X-Patchwork-Submitter: Junxiao Bi
X-Patchwork-Id: 7757421
From: Junxiao Bi <junxiao.bi@oracle.com>
To: ocfs2-devel@oss.oracle.com
Date: Thu, 3 Dec 2015 16:50:03 +0800
Message-Id: <1449132603-4918-1-git-send-email-junxiao.bi@oracle.com>
X-Mailer: git-send-email 1.7.9.5
Cc: mfasheh@suse.com
Subject: [Ocfs2-devel] [PATCH] ocfs2: dlm: fix deadlock due to nested lock

The DLM allows nested cluster locking: a node X may acquire a cluster lock
a second time before releasing its first hold. But if, between those two
acquisitions, another node Y requests the same lock and is blocked, a BAST
is sent to node X and OCFS2_LOCK_BLOCKED is set on that lock's lockres. In
this case the second acquisition on node X deadlocks: the nested request
waits for the lockres to be downconverted for node Y, but the downconvert
cannot happen until node X drops its first hold, which it will only do
after the nested request returns. Letting a nested lock request proceed
instead of blocking fixes this.
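
To make the failing interleaving concrete, here is a sketch of the
sequence (the call sites are illustrative assumptions;
ocfs2_inode_lock()/ocfs2_inode_unlock() stand in for any code path that
nests a cluster lock):

    node X                                  node Y
    ------                                  ------
    ocfs2_inode_lock(inode);  /* granted */
                                            ocfs2_inode_lock(inode);  /* blocked */
    /* BAST for Y's request sets OCFS2_LOCK_BLOCKED on X's lockres */
    ocfs2_inode_lock(inode);  /* nested request */
      -> waits until the lockres is downconverted for Y
      -> the downconvert waits for X to drop its first hold
      -> the first hold is dropped only after the nested request returns
      => deadlock: ocfs2_inode_unlock(inode) is never reached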
This can be reproduced on a v4.3 kernel with the ocfs2-test multiple
reflink test: the whole cluster hangs, with the following call trace.

INFO: task multi_reflink_t:10118 blocked for more than 120 seconds.
      Tainted: G           OE   4.3.0 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
multi_reflink_t D ffff88003e816980     0 10118  10117 0x00000080
 ffff880005b735f8 0000000000000082 ffffffff81a25500 ffff88003e750000
 ffff880005b735c8 ffffffff8117992f ffffea0000929f80 ffff88003e816980
 7fffffffffffffff 0000000000000000 0000000000000001 ffffea0000929f80
Call Trace:
 [] ? find_get_entry+0x2f/0xc0
 [] schedule+0x3e/0x80
 [] schedule_timeout+0x1c8/0x220
 [] ? ocfs2_inode_cache_unlock+0x14/0x20 [ocfs2]
 [] ? ocfs2_metadata_cache_unlock+0x19/0x30 [ocfs2]
 [] ? ocfs2_buffer_cached+0x99/0x170 [ocfs2]
 [] ? ocfs2_inode_cache_unlock+0x14/0x20 [ocfs2]
 [] ? ocfs2_metadata_cache_unlock+0x19/0x30 [ocfs2]
 [] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
 [] wait_for_completion+0xde/0x110
 [] ? try_to_wake_up+0x240/0x240
 [] __ocfs2_cluster_lock+0x20d/0x720 [ocfs2]
 [] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
 [] ocfs2_inode_lock_full_nested+0x181/0x400 [ocfs2]
 [] ? ocfs2_iop_get_acl+0x53/0x113 [ocfs2]
 [] ? igrab+0x42/0x70
 [] ocfs2_iop_get_acl+0x53/0x113 [ocfs2]
 [] get_acl+0x53/0x70
 [] posix_acl_create+0x73/0x130
 [] ocfs2_mknod+0x7cf/0x1140 [ocfs2]
 [] ocfs2_create+0x62/0x110 [ocfs2]
 [] ? __d_alloc+0x65/0x190
 [] ? __inode_permission+0x4e/0xd0
 [] vfs_create+0xd5/0x100
 [] ? lookup_real+0x1d/0x60
 [] lookup_open+0x173/0x1a0
 [] ? percpu_down_read+0x16/0x70
 [] do_last+0x31a/0x830
 [] ? __inode_permission+0x4e/0xd0
 [] ? inode_permission+0x18/0x50
 [] ? link_path_walk+0x290/0x550
 [] path_openat+0x7c/0x140
 [] do_filp_open+0x85/0xe0
 [] ? getname_flags+0x7f/0x1f0
 [] do_sys_open+0x11a/0x220
 [] ? syscall_trace_enter_phase1+0x15b/0x170
 [] SyS_open+0x1e/0x20
 [] entry_SYSCALL_64_fastpath+0x12/0x71

Commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
added nested locking to ocfs2_mknod(), which exposed this deadlock, but it
is really a generic issue that can be triggered elsewhere.

Cc: 
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
---
 fs/ocfs2/dlmglue.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
index 1c91103..5b7d9d4 100644
--- a/fs/ocfs2/dlmglue.c
+++ b/fs/ocfs2/dlmglue.c
@@ -1295,7 +1295,9 @@ static inline int ocfs2_may_continue_on_blocked_lock(struct ocfs2_lock_res *lock
 {
 	BUG_ON(!(lockres->l_flags & OCFS2_LOCK_BLOCKED));
 
-	return wanted <= ocfs2_highest_compat_lock_level(lockres->l_blocking);
+	/* allow a nested lock request to go through to avoid deadlock. */
+	return wanted <= ocfs2_highest_compat_lock_level(lockres->l_blocking)
+		|| lockres->l_ro_holders || lockres->l_ex_holders;
 }
 
 static void ocfs2_init_mask_waiter(struct ocfs2_mask_waiter *mw)
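
For readers without the tree at hand, here is a minimal userspace model
of the patched predicate (a sketch only: the simplified lock levels, the
stripped-down lockres, and the helper names mirror dlmglue but are
assumptions, not the kernel's real types):

#include <stdio.h>

/* Simplified DLM lock levels: NL < PR < EX (modeled, not kernel headers). */
enum { DLM_LOCK_NL = 0, DLM_LOCK_PR = 3, DLM_LOCK_EX = 5 };

/* Stripped-down model of a lock resource. */
struct lockres {
	int l_blocking;            /* level the blocked remote request wants */
	unsigned int l_ro_holders; /* local read-only (PR) holders */
	unsigned int l_ex_holders; /* local exclusive (EX) holders */
};

/* Highest local level still compatible with the blocked remote request. */
static int highest_compat_lock_level(int blocking)
{
	if (blocking == DLM_LOCK_EX)
		return DLM_LOCK_NL;	/* remote wants EX: we may hold nothing */
	if (blocking == DLM_LOCK_PR)
		return DLM_LOCK_PR;	/* remote wants PR: holding PR is fine */
	return DLM_LOCK_EX;
}

/*
 * The patched check: continue if the wanted level is compatible with the
 * blocked remote request, OR if this node already holds the lock (a
 * nested request), since making a nested request wait would deadlock.
 */
static int may_continue_on_blocked_lock(struct lockres *res, int wanted)
{
	return wanted <= highest_compat_lock_level(res->l_blocking)
		|| res->l_ro_holders || res->l_ex_holders;
}

int main(void)
{
	/* Node X holds the lock PR; node Y's EX request marked it BLOCKED. */
	struct lockres res = { .l_blocking = DLM_LOCK_EX, .l_ro_holders = 1 };

	printf("nested PR request may continue: %d\n",
	       may_continue_on_blocked_lock(&res, DLM_LOCK_PR));
	return 0;
}

Compiled and run, this prints 1: under the old check the nested PR
request would have waited forever (PR is not <= NL), while the added
holder terms let it continue.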