From patchwork Fri Feb 22 02:08:19 2013
X-Patchwork-Submitter: "Yan, Zheng"
X-Patchwork-Id: 2174251
From: "Yan, Zheng"
To: ceph-devel@vger.kernel.org, sage@inktank.com
Cc: "Yan, Zheng"
Subject: [PATCH 3/3] ceph: fix vmtruncate deadlock
Date: Fri, 22 Feb 2013 10:08:19 +0800
Message-Id: <1361498899-15831-4-git-send-email-zheng.z.yan@intel.com>
X-Mailer: git-send-email 1.7.11.7
In-Reply-To: <1361498899-15831-1-git-send-email-zheng.z.yan@intel.com>
References: <1361498899-15831-1-git-send-email-zheng.z.yan@intel.com>

From: "Yan, Zheng"

If there is a pending truncation, ceph_get_caps() waits for a work
thread to apply it. To apply the pending truncation, the work thread
needs to acquire i_mutex. But in the buffered write case,
ceph_get_caps() is called while i_mutex is locked, so the writer and
the work thread wait for each other.

The fix is to make ceph_get_caps() not wait, and to apply the pending
truncation in ceph_aio_write() instead.

Signed-off-by: Yan, Zheng
---
 fs/ceph/caps.c |  6 ------
 fs/ceph/file.c | 15 ++++++++++-----
 2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 5d5c32b..b9d8417 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -2067,12 +2067,6 @@ static int try_get_cap_refs(struct ceph_inode_info *ci, int need, int want,
 	}
 	have = __ceph_caps_issued(ci, &implemented);
 
-	/*
-	 * disallow writes while a truncate is pending
-	 */
-	if (ci->i_truncate_pending)
-		have &= ~CEPH_CAP_FILE_WR;
-
 	if ((have & need) == need) {
 		/*
 		 * Look at (implemented & ~have & not) so that we keep waiting
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index a1e5b81..bf7849a 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -653,7 +653,6 @@ static ssize_t ceph_aio_read(struct kiocb *iocb, const struct iovec *iov,
 	dout("aio_read %p %llx.%llx %llu~%u trying to get caps on %p\n",
 	     inode, ceph_vinop(inode), pos, (unsigned)len, inode);
 again:
-	__ceph_do_pending_vmtruncate(inode);
 	if (fi->fmode & CEPH_FILE_MODE_LAZY)
 		want = CEPH_CAP_FILE_CACHE | CEPH_CAP_FILE_LAZYIO;
 	else
@@ -724,11 +723,13 @@ static ssize_t ceph_aio_write(struct kiocb *iocb, const struct iovec *iov,
 	if (ceph_snap(inode) != CEPH_NOSNAP)
 		return -EROFS;
 
+	sb_start_write(inode->i_sb);
 retry_snap:
 	written = 0;
-	if (ceph_osdmap_flag(osdc->osdmap, CEPH_OSDMAP_FULL))
-		return -ENOSPC;
-	__ceph_do_pending_vmtruncate(inode);
+	if (ceph_osdmap_flag(osdc->osdmap, CEPH_OSDMAP_FULL)) {
+		ret = -ENOSPC;
+		goto out;
+	}
 
 	/*
 	 * try to do a buffered write. if we don't have sufficient
@@ -738,7 +739,10 @@ retry_snap:
 	if (!(iocb->ki_filp->f_flags & O_DIRECT) &&
 	    !(inode->i_sb->s_flags & MS_SYNCHRONOUS) &&
 	    !(fi->flags & CEPH_F_SYNC)) {
-		ret = generic_file_aio_write(iocb, iov, nr_segs, pos);
+		mutex_lock(&inode->i_mutex);
+		__ceph_do_pending_vmtruncate(inode);
+		ret = __generic_file_aio_write(iocb, iov, nr_segs, &iocb->ki_pos);
+		mutex_unlock(&inode->i_mutex);
 		if (ret >= 0)
 			written = ret;
 
@@ -783,6 +787,7 @@ out:
 		     inode, ceph_vinop(inode), pos, (unsigned)iov->iov_len);
 		goto retry_snap;
 	}
+	sb_end_write(inode->i_sb);
 
 	return ret;
 }
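
Note: the locking order adopted by this patch can be illustrated with a
small user-space sketch. This is not kernel code and not part of the
patch. It assumes POSIX threads, and every demo_* name below is
hypothetical, chosen only to mirror i_mutex, i_truncate_pending and
__ceph_do_pending_vmtruncate(). The point it tries to show is that the
writer applies the pending truncation itself while holding the mutex,
so it never has to wait for a worker that needs the same mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the inode state involved in the deadlock. */
struct demo_inode {
	pthread_mutex_t i_mutex;	/* plays the role of inode->i_mutex */
	bool truncate_pending;		/* plays the role of i_truncate_pending */
	size_t truncate_size;		/* size the pending truncation applies */
	size_t size;			/* current file size */
};

/* Apply a pending truncation, if any. Caller must hold i_mutex. */
void demo_do_pending_truncate(struct demo_inode *in)
{
	if (in->truncate_pending) {
		in->size = in->truncate_size;
		in->truncate_pending = false;
	}
}

/*
 * Buffered write path after the fix: take the mutex, apply the pending
 * truncation directly, then extend the file. Nothing here waits for
 * another thread while the mutex is held.
 */
size_t demo_buffered_write(struct demo_inode *in, size_t pos, size_t len)
{
	pthread_mutex_lock(&in->i_mutex);
	demo_do_pending_truncate(in);
	if (pos + len > in->size)
		in->size = pos + len;
	pthread_mutex_unlock(&in->i_mutex);
	return len;
}

/* Work thread: also applies the truncation, under the same mutex. */
void *demo_truncate_worker(void *arg)
{
	struct demo_inode *in = arg;

	pthread_mutex_lock(&in->i_mutex);
	demo_do_pending_truncate(in);
	pthread_mutex_unlock(&in->i_mutex);
	return NULL;
}

/* Race one writer against one worker and return the final file size. */
size_t demo_run(void)
{
	struct demo_inode in = {
		.truncate_pending = true,
		.truncate_size = 0,
		.size = 4096,
	};
	pthread_t worker;
	int rc;

	pthread_mutex_init(&in.i_mutex, NULL);
	rc = pthread_create(&worker, NULL, demo_truncate_worker, &in);
	assert(rc == 0);
	demo_buffered_write(&in, 0, 100);
	pthread_join(worker, NULL);
	pthread_mutex_destroy(&in.i_mutex);
	return in.size;
}
```

Whichever thread takes the mutex first applies the truncation and the
other finds nothing pending, so demo_run() finishes in both orderings.
Build with something like "cc -pthread" and call demo_run() from a
small main().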