From patchwork Thu Nov 24 11:13:37 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Filipe Manana
X-Patchwork-Id: 9445231
Reply-To: fdmanana@gmail.com
In-Reply-To: <20161123212210.GA24103@localhost.localdomain>
References: <1478287254-5458-1-git-send-email-bo.li.liu@oracle.com>
 <20161123212210.GA24103@localhost.localdomain>
From: Filipe Manana
Date: Thu, 24 Nov 2016 11:13:37 +0000
Subject: Re: resend: Re: Btrfs: adjust len of writes if following a preallocated extent
To: Liu Bo
Cc: Stefan Priebe - Profihost AG, "linux-btrfs@vger.kernel.org", David Sterba, Josef Bacik
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-btrfs@vger.kernel.org

On Wed, Nov 23, 2016 at 9:22 PM, Liu Bo wrote:
> Hi,
>
> On Wed, Nov 23, 2016 at 06:21:35PM +0100, Stefan Priebe - Profihost AG wrote:
>> Hi,
>>
>> sorry last mail was from the wrong box.
>>
>> On 04.11.2016 at 20:20, Liu Bo wrote:
>> > If we have
>> >
>> > |0--hole--4095||4096--preallocate--12287|
>> >
>> > instead of using preallocated space, an 8K direct write will just
>> > create a new 8K extent and it'll end up with
>> >
>> > |0--new extent--8191||8192--preallocate--12287|
>> >
>> > It's because we find a hole em and then go to create a new 8K
>> > extent directly without adjusting @len.
>>
>> After applying that one on top of my 4.4 btrfs branch (includes patches
>> up to 4.10 / next), I'm getting deadlocks in btrfs.
>
> This is really interesting, thanks for the quick testing.
>
> After going through the stacks listed below, I think the patch has
> exposed a bug around BTRFS_I(inode)->dio_sem:
>
> 1. Since fsync has acquired inode_lock(), the dio write must be
> an overwrite within EOF.
>
> 2. Let's say the inode size is 16k and it already has a preallocated
> extent [4k, 8k]. Then we feed it with a dio write against [0k, 8k]. With
> this patch applied, the write can be split into a new extent of [0, 4k]
> and a fill-write against the preallocated one [4k, 8k].
>
> 3.
>   dio                                              fsync
>   ->btrfs_direct_IO                                btrfs_sync_file
>     ->do_direct_IO
>       ->get_more_blocks()                          ->inode_lock()
>         ->btrfs_get_blocks_direct() # for [0, 8k]  ->btrfs_log_inode()
>           ->btrfs_new_extent_direct()              ->btrfs_log_changed_extents()
>           ->btrfs_create_dio_extent()
>             ->down_read(&BTRFS_I(inode)->dio_sem)
>             # dio write is split and
>             # em of [0, 4k] is inserted as well as
>             # the ordered extent.
>             ->up_read(&BTRFS_I(inode)->dio_sem)
>       # do_direct_IO tries to collect more pages
>       # before sending them down, so [0, 4k] is not
>       # yet submitted.
> --------------------------------------------------------------------------------------------------------
>                                                    ->down_write(&BTRFS_I(inode)->dio_sem)
>                                                    # found ordered extent of [0, 4k]
>                                                    # wait for [0, 4k] to finish
>       ->get_more_blocks()
>         ->btrfs_get_blocks_direct() # for [4k, 8k]
>           ->btrfs_create_dio_extent()
>             ->down_read(&BTRFS_I(inode)->dio_sem)
>             # deadlock occurs
>
> 4. _Without_ this patch, we could hit the deadlock as well under space pressure,
> i.e. if we request [0, 8k], but btrfs_reserve_extent() returns only [0, 4k].
>
> (Filipe may correct me, cc'd Filipe.)

The analysis is correct, Bo.

Originally, to fix races between fsync and direct IO writes, there was a
solution [1, 2] that didn't involve adding a semaphore. It relied on
creating the ordered extents first and the extent maps second, only in
the direct IO write path (we do things in the reverse order everywhere
else). It worked and was documented in comments, but it wasn't
particularly elegant and Josef was not happy with it, so we then added
the semaphore and made the direct IO write path create the extent maps
and ordered extents in the same order as everywhere else [3].

So here I can only see 2 simple solutions. Either revert [3] (which
added the semaphore) or acquire the semaphore at a higher level in the
direct IO write path, like this:

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 1f980ef..b2c277d 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7237,7 +7237,6 @@ static struct extent_map *btrfs_create_dio_extent(struct inode *inode,
 	struct extent_map *em = NULL;
 	int ret;

-	down_read(&BTRFS_I(inode)->dio_sem);
 	if (type != BTRFS_ORDERED_NOCOW) {
 		em = create_pinned_em(inode, start, len, orig_start,
 				      block_start, block_len, orig_block_len,
@@ -7256,8 +7255,6 @@ static struct extent_map *btrfs_create_dio_extent(struct inode *inode,
 		em = ERR_PTR(ret);
 	}
  out:
-	up_read(&BTRFS_I(inode)->dio_sem);
-
 	return em;
 }

@@ -8715,11 +8712,14 @@ static ssize_t btrfs_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 		wakeup = false;
 	}

+	if (iov_iter_rw(iter) == WRITE)
+		down_read(&BTRFS_I(inode)->dio_sem);
 	ret = __blockdev_direct_IO(iocb, inode,
 				   BTRFS_I(inode)->root->fs_info->fs_devices->latest_bdev,
 				   iter, btrfs_get_blocks_direct, NULL,
 				   btrfs_submit_direct, flags);
 	if (iov_iter_rw(iter) == WRITE) {
+		up_read(&BTRFS_I(inode)->dio_sem);
 		current->journal_info = NULL;
 		if (ret < 0 && ret != -EIOCBQUEUED) {
 			if (dio_data.reserve)

Let me know what you think.
Thanks.

[1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=de0ee0edb21fbab4c7afa3e94573ecfebfb0244e
[2] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=0b901916a00bc7b14ee83cc8e41c3b0d561a8f22
[3] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=5f9a8a51d8b95505d8de8b7191ae2ed8c504d4af

>
> Thanks,
>
> -liubo
>
>>
>> Traces here:
>> INFO: task btrfs-transacti:604 blocked for more than 120 seconds.
>> Not tainted 4.4.34 #1
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> btrfs-transacti D ffff8814e78cbe00 0 604 2 0x00080000
>> ffff8814e78cbe00 ffff88017367a540 ffff8814e2f88000 ffff8814e78cc000
>> ffff8814e78cbe38 ffff88123616c510 ffff8814e24c81f0 ffff88153fb0a000
>> ffff8814e78cbe18 ffffffff816a8425 ffff8814e63165a0 ffff8814e78cbe88
>> Call Trace:
>> [] schedule+0x35/0x80
>> [] btrfs_commit_transaction+0x275/0xa50 [btrfs]
>> [] transaction_kthread+0x1d6/0x200 [btrfs]
>> [] kthread+0xdb/0x100
>> [] ret_from_fork+0x3f/0x70
>> DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70
>>
>> Leftover inexact backtrace:
>>
>> [] ? kthread_park+0x60/0x60
>> INFO: task mysqld:1977 blocked for more than 120 seconds.
>> Not tainted 4.4.34 #1
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> mysqld D ffff88142ef1bcf8 0 1977 1 0x00080000
>> ffff88142ef1bcf8 ffffffff81e0f500 ffff8814dc2c4a80 ffff88142ef1c000
>> ffff8814e32ed298 ffff8814e32ed2c0 ffff88110aa9a000 ffff8814e32ed000
>> ffff88142ef1bd10 ffffffff816a8425 ffff8814e32ed000 ffff88142ef1bd60
>> Call Trace:
>> [] schedule+0x35/0x80
>> [] wait_for_writer+0xa2/0xb0 [btrfs]
>> [] btrfs_sync_log+0xe9/0xa00 [btrfs]
>> [] btrfs_sync_file+0x35f/0x3d0 [btrfs]
>> [] vfs_fsync_range+0x3d/0xb0
>> [] do_fsync+0x3d/0x70
>> [] SyS_fsync+0x10/0x20
>> [] entry_SYSCALL_64_fastpath+0x12/0x71
>> DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x12/0x71
>>
>> Leftover inexact backtrace:
>>
>> INFO: task mysqld:3249 blocked for more than 120 seconds.
>> Not tainted 4.4.34 #1
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> mysqld D ffff881475fdfa40 0 3249 1 0x00080000
>> ffff881475fdfa40 ffff88017367ca80 ffff8814433d2540 ffff881475fe0000
>> ffff88040da39ba0 0000000000230000 ffff88040da39c20 0000000000238000
>> ffff881475fdfa58 ffffffff816a8425 0000000000008000 ffff881475fdfb18
>> Call Trace:
>> [] schedule+0x35/0x80
>> [] wait_ordered_extents.isra.18.constprop.23+0x147/0x3d0 [btrfs]
>> [] btrfs_log_changed_extents+0x242/0x610 [btrfs]
>> [] btrfs_log_inode+0x874/0xb80 [btrfs]
>> [] btrfs_log_inode_parent+0x22c/0x910 [btrfs]
>> [] btrfs_log_dentry_safe+0x62/0x80 [btrfs]
>> [] btrfs_sync_file+0x28c/0x3d0 [btrfs]
>> [] vfs_fsync_range+0x3d/0xb0
>> [] do_fsync+0x3d/0x70
>> [] SyS_fsync+0x10/0x20
>> [] entry_SYSCALL_64_fastpath+0x12/0x71
>> DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x12/0x71
>>
>> Leftover inexact backtrace:
>>
>> INFO: task mysqld:3250 blocked for more than 120 seconds.
>> Not tainted 4.4.34 #1
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> mysqld D ffff881374edb868 0 3250 1 0x00080000
>> ffff881374edb868 ffff8801736b2540 ffff8814433d4a80 ffff881374edc000
>> ffff8814e26f81c8 ffff8814e26f81e0 0000000000238000 00000000000a8000
>> ffff881374edb880 ffffffff816a8425 ffff8814433d4a80 ffff881374edb8d8
>> Call Trace:
>> [] schedule+0x35/0x80
>> [] rwsem_down_read_failed+0xed/0x130
>> [] call_rwsem_down_read_failed+0x14/0x30
>> DWARF2 unwinder stuck at call_rwsem_down_read_failed+0x14/0x30
>>
>> Leftover inexact backtrace:
>>
>> [] ? down_read+0x17/0x20
>> [] btrfs_create_dio_extent+0x46/0x1e0 [btrfs]
>> [] btrfs_get_blocks_direct+0x3d8/0x730 [btrfs]
>> [] ? btrfs_submit_direct+0x1ce/0x740 [btrfs]
>> [] do_blockdev_direct_IO+0x11f7/0x2bc0
>> [] ? btrfs_page_exists_in_range+0xe0/0xe0 [btrfs]
>> [] ? btrfs_getattr+0xa0/0xa0 [btrfs]
>> [] __blockdev_direct_IO+0x43/0x50
>> [] ? btrfs_getattr+0xa0/0xa0 [btrfs]
>> [] btrfs_direct_IO+0x1d1/0x380 [btrfs]
>> [] ? btrfs_getattr+0xa0/0xa0 [btrfs]
>> [] generic_file_direct_write+0xaa/0x170
>> [] btrfs_file_write_iter+0x2ae/0x560 [btrfs]
>> [] ? futex_wake+0x81/0x150
>> [] new_sync_write+0x84/0xb0
>> [] __vfs_write+0x26/0x40
>> [] vfs_write+0xa9/0x190
>> [] ? enter_from_user_mode+0x1f/0x50
>> [] SyS_pwrite64+0x6b/0xa0
>> [] ? syscall_return_slowpath+0xb0/0x130
>> [] entry_SYSCALL_64_fastpath+0x12/0x71
>>
>> Greets,
>> Stefan
>>
>> >
>> > Signed-off-by: Liu Bo
>> > Reviewed-by: Chris Mason
>> > ---
>> > fs/btrfs/inode.c | 8 +++++---
>> > 1 file changed, 5 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
>> > index 2b790bd..48e9356 100644
>> > --- a/fs/btrfs/inode.c
>> > +++ b/fs/btrfs/inode.c
>> > @@ -7783,10 +7783,12 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock,
>> >  	}
>> >
>> >  	/*
>> > -	 * this will cow the extent, reset the len in case we changed
>> > -	 * it above
>> > +	 * this will cow the extent, if em is within [start, len], then
>> > +	 * probably we've found a preallocated/existing extent, let's
>> > +	 * give it a chance to use preallocated space.
>> >  	 */
>> > -	len = bh_result->b_size;
>> > +	len = min_t(u64, bh_result->b_size, em->len - (start - em->start));
>> > +	len = ALIGN(len, root->sectorsize);
>> >  	free_extent_map(em);
>> >  	em = btrfs_new_extent_direct(inode, start, len);
>> >  	if (IS_ERR(em)) {
>> >
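
For illustration only, the length adjustment made by Liu Bo's patch quoted
above (the min_t() and ALIGN() lines) can be checked in user space with a
small sketch. This is not the kernel code: struct em_range, align_up() and
adjusted_len() are hypothetical names introduced here, standing in for
struct extent_map, ALIGN() and the two lines the patch adds, and the
4096-byte sector size in the example is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct extent_map: only the two
 * fields that the length computation reads. */
struct em_range {
	uint64_t start; /* file offset where the extent map begins */
	uint64_t len;   /* length of the extent map in bytes */
};

/* Round x up to a multiple of a, where a must be a power of two.
 * For such values this gives the same result as the kernel's ALIGN(). */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Length to allocate for a write that starts at `start` (inside `em`) and
 * asks for `want` bytes: clamp to the bytes left in em, then round up to
 * the sector size. Mirrors the two lines added by the quoted patch. */
static uint64_t adjusted_len(const struct em_range *em, uint64_t start,
			     uint64_t want, uint64_t sectorsize)
{
	uint64_t left = em->len - (start - em->start);
	uint64_t len = want < left ? want : left;

	return align_up(len, sectorsize);
}
```

With the layout from the patch description, a hole em covering [0, 4095]
and an 8K write at offset 0, adjusted_len(&hole, 0, 8192, 4096) yields 4096
rather than 8192. The new extent then stops at the hole boundary, and the
remaining 4K is handled by the next btrfs_get_blocks_direct() call, which
can land on the preallocated extent. That second call is the one that
needs dio_sem again in the deadlock scenario discussed above.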