From patchwork Wed Oct 19 05:19:41 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhen Ren
X-Patchwork-Id: 9383601
From: Eric Ren
To: ocfs2-devel@oss.oracle.com
Date: Wed, 19 Oct 2016 13:19:41 +0800
Message-Id: <1476854382-28101-2-git-send-email-zren@suse.com>
X-Mailer: git-send-email 2.6.6
In-Reply-To: <1476854382-28101-1-git-send-email-zren@suse.com>
References: <1476854382-28101-1-git-send-email-zren@suse.com>
Subject: [Ocfs2-devel] [DRAFT 1/2] ocfs2/dlmglue: keep track of the processes who take/put a cluster lock

We have to avoid recursive cluster locking, but there is currently no
way to check whether a process has already taken a cluster lock. Mostly,
we can avoid recursive locking by writing code carefully. However, it is
very hard to handle the routines that are invoked directly by the VFS.
For instance:

    const struct inode_operations ocfs2_file_iops = {
        .permission = ocfs2_permission,
        .get_acl    = ocfs2_iop_get_acl,
        .set_acl    = ocfs2_iop_set_acl,
    };

ocfs2_permission() and ocfs2_iop_get/set_acl() both call
ocfs2_inode_lock(). The problem is that the call chain of
ocfs2_permission() includes *_acl(), so the inode cluster lock is
requested again while the first request is still held.

I can think of two possible solutions. One is to implement the inode
permission routine in ocfs2 itself, replacing the existing
generic_permission(); this would bring lots of changes and pull too many
trivial VFS functions into the ocfs2 code. The other is to keep track of
the processes that lock/unlock a cluster lock. This patch provides
ocfs2_is_locked_by_me() so that a process can check whether it has
already requested the cluster lock. For now this is only used to avoid
recursive locking, though it can also help debug cluster locking issues.
Unfortunately, it may incur some performance loss.

Signed-off-by: Eric Ren
---
 fs/ocfs2/dlmglue.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/ocfs2/dlmglue.h | 13 ++++++++++++
 fs/ocfs2/ocfs2.h   |  1 +
 3 files changed, 74 insertions(+)

diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c
index 83d576f..9f91884 100644
--- a/fs/ocfs2/dlmglue.c
+++ b/fs/ocfs2/dlmglue.c
@@ -532,6 +532,7 @@ void ocfs2_lock_res_init_once(struct ocfs2_lock_res *res)
 	init_waitqueue_head(&res->l_event);
 	INIT_LIST_HEAD(&res->l_blocked_list);
 	INIT_LIST_HEAD(&res->l_mask_waiters);
+	INIT_LIST_HEAD(&res->l_holders);
 }
 
 void ocfs2_inode_lock_res_init(struct ocfs2_lock_res *res,
@@ -749,6 +750,48 @@ void ocfs2_lock_res_free(struct ocfs2_lock_res *res)
 	res->l_flags = 0UL;
 }
 
+static inline void ocfs2_add_holder(struct ocfs2_lock_res *lockres)
+{
+	struct ocfs2_holder *oh = kmalloc(sizeof(struct ocfs2_holder), GFP_KERNEL);
+
+	INIT_LIST_HEAD(&oh->oh_list);
+	oh->oh_lockres = lockres;
+	oh->oh_owner_pid = get_pid(task_pid(current));
+
+	spin_lock(&lockres->l_lock);
+	list_add_tail(&oh->oh_list, &lockres->l_holders);
+	spin_unlock(&lockres->l_lock);
+}
+
+static inline void ocfs2_remove_holder(struct ocfs2_lock_res *lockres,
+		struct ocfs2_holder *oh)
+{
+	spin_lock(&lockres->l_lock);
+	list_del(&oh->oh_list);
+	spin_unlock(&lockres->l_lock);
+
+	put_pid(oh->oh_owner_pid);
+	kfree(oh);
+}
+
+struct ocfs2_holder *ocfs2_is_locked_by_me(struct ocfs2_lock_res *lockres)
+{
+	struct ocfs2_holder *oh;
+	struct pid *pid;
+
+	spin_lock(&lockres->l_lock);
+	pid = task_pid(current);
+	list_for_each_entry(oh, &lockres->l_holders, oh_list) {
+		if (oh->oh_owner_pid == pid)
+			goto out;
+	}
+	oh = NULL;
+out:
+	spin_unlock(&lockres->l_lock);
+
+	return oh;
+}
+
 static inline void ocfs2_inc_holders(struct ocfs2_lock_res *lockres,
 				     int level)
 {
@@ -1392,6 +1435,7 @@ static int __ocfs2_cluster_lock(struct ocfs2_super *osb,
 	int noqueue_attempted = 0;
 	int dlm_locked = 0;
 	int kick_dc = 0;
+	struct ocfs2_holder *oh;
 
 	if (!(lockres->l_flags & OCFS2_LOCK_INITIALIZED)) {
 		mlog_errno(-EINVAL);
@@ -1403,6 +1447,14 @@ static int __ocfs2_cluster_lock(struct ocfs2_super *osb,
 	if (lockres->l_ops->flags & LOCK_TYPE_USES_LVB)
 		lkm_flags |= DLM_LKF_VALBLK;
 
+	/* This block is just used to check recursive locking now */
+	oh = ocfs2_is_locked_by_me(lockres);
+	if (unlikely(oh))
+		mlog_bug_on_msg(1, "PID(%d) locks on lockres(%s) recursively\n",
+				pid_nr(oh->oh_owner_pid), lockres->l_name);
+	else
+		ocfs2_add_holder(lockres);
+
 again:
 	wait = 0;
 
@@ -1596,6 +1648,14 @@ static void __ocfs2_cluster_unlock(struct ocfs2_super *osb,
 				   unsigned long caller_ip)
 {
 	unsigned long flags;
+	struct ocfs2_holder *oh = ocfs2_is_locked_by_me(lockres);
+
+	/* This block is just used to check recursive locking now */
+	if (unlikely(!oh))
+		mlog_bug_on_msg(1, "PID(%d) unlock lockres(%s) unnecessarily\n",
+				pid_nr(task_pid(current)), lockres->l_name);
+	else
+		ocfs2_remove_holder(lockres, oh);
 
 	spin_lock_irqsave(&lockres->l_lock, flags);
 	ocfs2_dec_holders(lockres, level);
diff --git a/fs/ocfs2/dlmglue.h b/fs/ocfs2/dlmglue.h
index d293a22..3b1d4e7 100644
--- a/fs/ocfs2/dlmglue.h
+++ b/fs/ocfs2/dlmglue.h
@@ -70,6 +70,13 @@ struct ocfs2_orphan_scan_lvb {
 	__be32 lvb_os_seqno;
 };
 
+struct ocfs2_holder {
+	struct list_head oh_list;
+
+	struct ocfs2_lock_res *oh_lockres;
+	struct pid *oh_owner_pid;
+};
+
 /* ocfs2_inode_lock_full() 'arg_flags' flags */
 /* don't wait on recovery. */
 #define OCFS2_META_LOCK_RECOVERY (0x01)
@@ -170,4 +177,10 @@ void ocfs2_put_dlm_debug(struct ocfs2_dlm_debug *dlm_debug);
 
 /* To set the locking protocol on module initialization */
 void ocfs2_set_locking_protocol(void);
+
+/*
+ * Keep a list of processes that have interest in a lockres.
+ * Note: this is now only used for checking recursive cluster locking.
+ */
+struct ocfs2_holder *ocfs2_is_locked_by_me(struct ocfs2_lock_res *lockres);
 #endif /* DLMGLUE_H */
diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
index e63af7d..12933b8 100644
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -172,6 +172,7 @@ struct ocfs2_lock_res {
 	struct list_head l_blocked_list;
 	struct list_head l_mask_waiters;
+	struct list_head l_holders;
 	unsigned long l_flags;
 	char l_name[OCFS2_LOCK_ID_MAX_LEN];