From patchwork Fri Dec 23 09:30:18 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chandan Rajendra X-Patchwork-Id: 9487301 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id B7F3A601C0 for ; Fri, 23 Dec 2016 09:31:24 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A00C32807E for ; Fri, 23 Dec 2016 09:31:24 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 94A0A280B0; Fri, 23 Dec 2016 09:31:24 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1FB232807E for ; Fri, 23 Dec 2016 09:31:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S938570AbcLWJbI (ORCPT ); Fri, 23 Dec 2016 04:31:08 -0500 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:43038 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S935495AbcLWJag (ORCPT ); Fri, 23 Dec 2016 04:30:36 -0500 Received: from pps.filterd (m0098420.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.17/8.16.0.17) with SMTP id uBN9U1dC122409 for ; Fri, 23 Dec 2016 04:30:35 -0500 Received: from e19.ny.us.ibm.com (e19.ny.us.ibm.com [129.33.205.209]) by mx0b-001b2d01.pphosted.com with ESMTP id 27gwmkqb8j-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Fri, 23 Dec 2016 04:30:35 -0500 Received: from localhost by e19.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 23 Dec 2016 04:30:34 -0500 Received: from d01dlp02.pok.ibm.com (9.56.250.167) by e19.ny.us.ibm.com (146.89.104.206) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Fri, 23 Dec 2016 04:30:32 -0500 Received: from b01cxnp22034.gho.pok.ibm.com (b01cxnp22034.gho.pok.ibm.com [9.57.198.24]) by d01dlp02.pok.ibm.com (Postfix) with ESMTP id 8DA956E803C; Fri, 23 Dec 2016 04:30:04 -0500 (EST) Received: from b01ledav001.gho.pok.ibm.com (b01ledav001.gho.pok.ibm.com [9.57.199.106]) by b01cxnp22034.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id uBN9UVO142205308; Fri, 23 Dec 2016 09:30:31 GMT Received: from b01ledav001.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 825032803F; Fri, 23 Dec 2016 04:30:31 -0500 (EST) Received: from localhost.localdomain.in.ibm.com (unknown [9.85.73.229]) by b01ledav001.gho.pok.ibm.com (Postfix) with ESMTP id B087C28046; Fri, 23 Dec 2016 04:30:29 -0500 (EST) From: Chandan Rajendra To: linux-btrfs@vger.kernel.org Cc: Chandan Rajendra , fdmanana@suse.com, dsterba@suse.com Subject: [PATCH] Btrfs: Fix deadlock between direct IO and fast fsync Date: Fri, 23 Dec 2016 15:00:18 +0530 X-Mailer: git-send-email 2.5.5 X-TM-AS-GCONF: 00 X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16122309-0056-0000-0000-0000024AE5E0 X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00006301; HX=3.00000240; KW=3.00000007; PH=3.00000004; SC=3.00000198; SDB=6.00797847; UDB=6.00387380; IPR=6.00575653; BA=6.00005001; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00013695; XFM=3.00000011; UTC=2016-12-23 09:30:33 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 16122309-0057-0000-0000-0000067FEE18 Message-Id: <1482485418-4190-1-git-send-email-chandan@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2016-12-23_07:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=1 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1612050000 definitions=main-1612230161 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The following deadlock is seen when executing generic/113 test, ---------------------------------------------------------+---------------------------------------------------- Direct I/O task Fast fsync task ---------------------------------------------------------+---------------------------------------------------- btrfs_direct_IO __blockdev_direct_IO do_blockdev_direct_IO do_direct_IO btrfs_get_blocks_direct while (blocks needs to written) get_more_blocks (first iteration) btrfs_get_blocks_direct btrfs_create_dio_extent down_read(&BTRFS_I(inode) >dio_sem) Create and add extent map and ordered extent up_read(&BTRFS_I(inode) >dio_sem) btrfs_sync_file btrfs_log_dentry_safe btrfs_log_inode_parent btrfs_log_inode btrfs_log_changed_extents down_write(&BTRFS_I(inode) >dio_sem) Collect new extent maps and ordered extents wait for ordered extent completion get_more_blocks (second iteration) btrfs_get_blocks_direct btrfs_create_dio_extent down_read(&BTRFS_I(inode) >dio_sem) -------------------------------------------------------------------------------------------------------------- In the above description, Btrfs direct I/O code path has not yet started submitting bios for file range covered by the initial ordered extent. Meanwhile, The fast fsync task obtains the write semaphore and waits for I/O on the ordered extent to get completed. However, the Direct I/O task is now blocked on obtaining the read semaphore. To resolve the deadlock, this commit modifies the Direct I/O code path to obtain the read semaphore before invoking __blockdev_direct_IO(). The semaphore is then given up after __blockdev_direct_IO() returns. This allows the Direct I/O code to complete I/O on all the ordered extents it creates. Signed-off-by: Chandan Rajendra Reviewed-by: Filipe Manana --- fs/btrfs/inode.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 5ca88f0..f796037 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -7325,7 +7325,6 @@ static struct extent_map *btrfs_create_dio_extent(struct inode *inode, struct extent_map *em = NULL; int ret; - down_read(&BTRFS_I(inode)->dio_sem); if (type != BTRFS_ORDERED_NOCOW) { em = create_pinned_em(inode, start, len, orig_start, block_start, block_len, orig_block_len, @@ -7344,7 +7343,6 @@ static struct extent_map *btrfs_create_dio_extent(struct inode *inode, em = ERR_PTR(ret); } out: - up_read(&BTRFS_I(inode)->dio_sem); return em; } @@ -8800,6 +8798,7 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) dio_data.unsubmitted_oe_range_start = (u64)offset; dio_data.unsubmitted_oe_range_end = (u64)offset; current->journal_info = &dio_data; + down_read(&BTRFS_I(inode)->dio_sem); } else if (test_bit(BTRFS_INODE_READDIO_NEED_LOCK, &BTRFS_I(inode)->runtime_flags)) { inode_dio_end(inode); @@ -8812,6 +8811,7 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter) iter, btrfs_get_blocks_direct, NULL, btrfs_submit_direct, flags); if (iov_iter_rw(iter) == WRITE) { + up_read(&BTRFS_I(inode)->dio_sem); current->journal_info = NULL; if (ret < 0 && ret != -EIOCBQUEUED) { if (dio_data.reserve)