From patchwork Thu Jun 16 15:45:39 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 12884389
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: Naohiro Aota
Subject: [PATCH 1/4] btrfs: ensure pages are unlocked on cow_file_range() failure
Date: Fri, 17 Jun 2022 00:45:39 +0900
Message-Id: <318b80987f74e1cf6bf4ab09aed2399538fc4f9e.1655391633.git.naohiro.aota@wdc.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To:
References:
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

There is a hung_task report on zoned btrfs like below.

https://github.com/naota/linux/issues/59

[ 726.328648] INFO: task rocksdb:high0:11085 blocked for more than 241 seconds.
[ 726.329839] Not tainted 5.16.0-rc1+ #1
[ 726.330484] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 726.331603] task:rocksdb:high0 state:D stack: 0 pid:11085 ppid: 11082 flags:0x00000000
[ 726.331608] Call Trace:
[ 726.331611]
[ 726.331614] __schedule+0x2e5/0x9d0
[ 726.331622] schedule+0x58/0xd0
[ 726.331626] io_schedule+0x3f/0x70
[ 726.331629] __folio_lock+0x125/0x200
[ 726.331634] ? find_get_entries+0x1bc/0x240
[ 726.331638] ? filemap_invalidate_unlock_two+0x40/0x40
[ 726.331642] truncate_inode_pages_range+0x5b2/0x770
[ 726.331649] truncate_inode_pages_final+0x44/0x50
[ 726.331653] btrfs_evict_inode+0x67/0x480
[ 726.331658] evict+0xd0/0x180
[ 726.331661] iput+0x13f/0x200
[ 726.331664] do_unlinkat+0x1c0/0x2b0
[ 726.331668] __x64_sys_unlink+0x23/0x30
[ 726.331670] do_syscall_64+0x3b/0xc0
[ 726.331674] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 726.331677] RIP: 0033:0x7fb9490a171b
[ 726.331681] RSP: 002b:00007fb943ffac68 EFLAGS: 00000246 ORIG_RAX: 0000000000000057
[ 726.331684] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb9490a171b
[ 726.331686] RDX: 00007fb943ffb040 RSI: 000055a6bbe6ec20 RDI: 00007fb94400d300
[ 726.331687] RBP: 00007fb943ffad00 R08: 0000000000000000 R09: 0000000000000000
[ 726.331688] R10: 0000000000000031 R11: 0000000000000246 R12: 00007fb943ffb000
[ 726.331690] R13: 00007fb943ffb040 R14: 0000000000000000 R15: 00007fb943ffd260
[ 726.331693]

While debugging the issue, we found that running fstests generic/551 on
a 5GB non-zoned null_blk device in the emulated zoned mode also hit a
similar hang. We can also reproduce the same symptom with a setup that
injects errors into cow_file_range().

The hang occurs when cow_file_range() fails in the middle of allocation.
cow_file_range() called from do_allocation_zoned() can split the given
region ([start, end]) for allocation depending on current block group
usages. When btrfs can allocate bytes for one part of the split regions
but fails for the other region (e.g. because of -ENOSPC), we return the
error leaving the pages in the succeeded regions locked. Technically,
this occurs only when @unlock == 0. Otherwise, we unlock the pages in an
allocated region after creating an ordered extent.

Since the callers of cow_file_range(unlock=0) won't write out the pages
after a failure, we can unlock them on the error exit from
cow_file_range(). That way, all the pages except @locked_page are
guaranteed to be unlocked in the error case.

In summary, cow_file_range() now behaves like this:

- page_started == 1 (return value)
  - All the pages are unlocked. IO is started.
- unlock == 1
  - All the pages except @locked_page are unlocked in any case
- unlock == 0
  - On success, all the pages are locked for writing them out
  - On failure, all the pages except @locked_page are unlocked

Fixes: 42c011000963 ("btrfs: zoned: introduce dedicated data write path for zoned filesystems")
CC: stable@vger.kernel.org # 5.12+
Signed-off-by: Naohiro Aota
---
 fs/btrfs/inode.c | 72 ++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 64 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 1247690e7021..0c3d9998470f 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1134,6 +1134,28 @@ static u64 get_extent_allocation_hint(struct btrfs_inode *inode, u64 start,
  * *page_started is set to one if we unlock locked_page and do everything
  * required to start IO on it. It may be clean and already done with
  * IO when we return.
+ *
+ * When unlock == 1, we unlock the pages in successfully allocated regions.
+ * When unlock == 0, we leave them locked for writing them out.
+ *
+ * However, we unlock all the pages except @locked_page in case of failure.
+ *
+ * In summary, page locking state will be as follows:
+ *
+ * - page_started == 1 (return value)
+ *     - All the pages are unlocked. IO is started.
+ *     - Note that this can happen only on success
+ * - unlock == 1
+ *     - All the pages except @locked_page are unlocked in any case
+ * - unlock == 0
+ *     - On success, all the pages are locked for writing them out
+ *     - On failure, all the pages except @locked_page are unlocked
+ *
+ * When a failure happens in the second or later iteration of the
+ * while-loop, the ordered extents created in previous iterations are kept
+ * intact. So, the caller must clean them up by calling
+ * btrfs_cleanup_ordered_extents(). See btrfs_run_delalloc_range() for
+ * example.
  */
 static noinline int cow_file_range(struct btrfs_inode *inode,
 				   struct page *locked_page,
@@ -1143,6 +1165,7 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 	struct btrfs_root *root = inode->root;
 	struct btrfs_fs_info *fs_info = root->fs_info;
 	u64 alloc_hint = 0;
+	u64 orig_start = start;
 	u64 num_bytes;
 	unsigned long ram_size;
 	u64 cur_alloc_size = 0;
@@ -1336,18 +1359,44 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 	btrfs_dec_block_group_reservations(fs_info, ins.objectid);
 	btrfs_free_reserved_extent(fs_info, ins.objectid, ins.offset, 1);
 out_unlock:
+	/*
+	 * Now, we have three regions to clean up, as shown below.
+	 *
+	 * |-------(1)----|---(2)---|-------------(3)----------|
+	 * `- orig_start  `- start  `- start + cur_alloc_size  `- end
+	 *
+	 * We process each region below.
+	 */
+
 	clear_bits = EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DELALLOC_NEW |
 		EXTENT_DEFRAG | EXTENT_CLEAR_META_RESV;
 	page_ops = PAGE_UNLOCK | PAGE_START_WRITEBACK | PAGE_END_WRITEBACK;
+
 	/*
-	 * If we reserved an extent for our delalloc range (or a subrange) and
-	 * failed to create the respective ordered extent, then it means that
-	 * when we reserved the extent we decremented the extent's size from
-	 * the data space_info's bytes_may_use counter and incremented the
-	 * space_info's bytes_reserved counter by the same amount. We must make
-	 * sure extent_clear_unlock_delalloc() does not try to decrement again
-	 * the data space_info's bytes_may_use counter, therefore we do not pass
-	 * it the flag EXTENT_CLEAR_DATA_RESV.
+	 * For the range (1). We have already instantiated the ordered extents
+	 * for this region. They are cleaned up by
+	 * btrfs_cleanup_ordered_extents() in e.g.,
+	 * btrfs_run_delalloc_range(). EXTENT_LOCKED | EXTENT_DELALLOC are
+	 * already cleared in the above loop. And, EXTENT_DELALLOC_NEW |
+	 * EXTENT_DEFRAG | EXTENT_CLEAR_META_RESV are handled by the cleanup
+	 * function.
+	 *
+	 * However, in case of unlock == 0, we still need to unlock the pages
+	 * (except @locked_page) to ensure all the pages are unlocked.
+	 */
+	if (!unlock && orig_start < start)
+		extent_clear_unlock_delalloc(inode, orig_start, start - 1,
+					     locked_page, 0, page_ops);
+
+	/*
+	 * For the range (2). If we reserved an extent for our delalloc range
+	 * (or a subrange) and failed to create the respective ordered extent,
+	 * then it means that when we reserved the extent we decremented the
+	 * extent's size from the data space_info's bytes_may_use counter and
+	 * incremented the space_info's bytes_reserved counter by the same
+	 * amount. We must make sure extent_clear_unlock_delalloc() does not try
+	 * to decrement again the data space_info's bytes_may_use counter,
+	 * therefore we do not pass it the flag EXTENT_CLEAR_DATA_RESV.
 	 */
 	if (extent_reserved) {
 		extent_clear_unlock_delalloc(inode, start,
@@ -1359,6 +1408,13 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 		if (start >= end)
 			goto out;
 	}
+
+	/*
+	 * For the range (3). We never touched the region. In addition to the
+	 * clear_bits above, we add EXTENT_CLEAR_DATA_RESV to release the data
+	 * space_info's bytes_may_use counter, reserved in e.g.,
+	 * btrfs_check_data_free_space().
+	 */
 	extent_clear_unlock_delalloc(inode, start, end, locked_page,
 				     clear_bits | EXTENT_CLEAR_DATA_RESV,
 				     page_ops);

From patchwork Thu Jun 16 15:45:40 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 12884391
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: Naohiro Aota
Subject: [PATCH 2/4] btrfs: extend btrfs_cleanup_ordered_extents for NULL locked_page
Date: Fri, 17 Jun 2022 00:45:40 +0900
Message-Id: <6de954aed27f8e5ebccd780bbc40ce37a6ddf4f1.1655391633.git.naohiro.aota@wdc.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To:
References:
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

btrfs_cleanup_ordered_extents() assumes locked_page to be non-NULL, so it
is not usable for submit_uncompressed_range(), which can have a NULL
locked_page.

Add support for the locked_page == NULL case. Also, replace redundant
"page_offset(locked_page)" calls with the cached page_start.

Signed-off-by: Naohiro Aota
---
 fs/btrfs/inode.c | 36 +++++++++++++++++++++---------------
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 0c3d9998470f..4e1100f84a88 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -195,11 +195,14 @@ static inline void btrfs_cleanup_ordered_extents(struct btrfs_inode *inode,
 {
 	unsigned long index = offset >> PAGE_SHIFT;
 	unsigned long end_index = (offset + bytes - 1) >> PAGE_SHIFT;
-	u64 page_start = page_offset(locked_page);
-	u64 page_end = page_start + PAGE_SIZE - 1;
-
+	u64 page_start, page_end;
 	struct page *page;

+	if (locked_page) {
+		page_start = page_offset(locked_page);
+		page_end = page_start + PAGE_SIZE - 1;
+	}
+
 	while (index <= end_index) {
 		/*
 		 * For locked page, we will call end_extent_writepage() on it
@@ -212,7 +215,7 @@ static inline void btrfs_cleanup_ordered_extents(struct btrfs_inode *inode,
 		 * btrfs_mark_ordered_io_finished() would skip the accounting
 		 * for the page range, and the ordered extent will never finish.
 		 */
-		if (index == (page_offset(locked_page) >> PAGE_SHIFT)) {
+		if (locked_page && index == (page_start >> PAGE_SHIFT)) {
 			index++;
 			continue;
 		}
@@ -231,17 +234,20 @@ static inline void btrfs_cleanup_ordered_extents(struct btrfs_inode *inode,
 		put_page(page);
 	}

-	/* The locked page covers the full range, nothing needs to be done */
-	if (bytes + offset <= page_offset(locked_page) + PAGE_SIZE)
-		return;
-	/*
-	 * In case this page belongs to the delalloc range being instantiated
-	 * then skip it, since the first page of a range is going to be
-	 * properly cleaned up by the caller of run_delalloc_range
-	 */
-	if (page_start >= offset && page_end <= (offset + bytes - 1)) {
-		bytes = offset + bytes - page_offset(locked_page) - PAGE_SIZE;
-		offset = page_offset(locked_page) + PAGE_SIZE;
+	if (locked_page) {
+		/* The locked page covers the full range, nothing needs to be done */
+		if (bytes + offset <= page_start + PAGE_SIZE)
+			return;
+		/*
+		 * In case this page belongs to the delalloc range being
+		 * instantiated then skip it, since the first page of a range is
+		 * going to be properly cleaned up by the caller of
+		 * run_delalloc_range
+		 */
+		if (page_start >= offset && page_end <= (offset + bytes - 1)) {
+			bytes = offset + bytes - page_offset(locked_page) - PAGE_SIZE;
+			offset = page_offset(locked_page) + PAGE_SIZE;
+		}
 	}

 	return __endio_write_update_ordered(inode, offset, bytes, false);

From patchwork Thu Jun 16 15:45:41 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 12884393
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: Naohiro Aota
Subject: [PATCH 3/4] btrfs: fix error handling of fallbacked uncompress write
Date: Fri, 17 Jun 2022 00:45:41 +0900
Message-Id:
X-Mailer: git-send-email 2.35.1
In-Reply-To:
References:
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

When cow_file_range() fails in the middle of the allocation loop, it
unlocks the pages but leaves the ordered extents intact. Thus, we need to
call btrfs_cleanup_ordered_extents() to finish the created ordered
extents.

Also, we need to call end_extent_writepage() if locked_page is available
because btrfs_cleanup_ordered_extents() never processes the region on the
locked_page.

Furthermore, we need to set the mapping as error if locked_page is
unavailable before unlocking the pages, so that the errno is properly
propagated to userland.

CC: stable@vger.kernel.org
Signed-off-by: Naohiro Aota
---
 fs/btrfs/inode.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 4e1100f84a88..cae15924fc99 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -934,8 +934,18 @@ static int submit_uncompressed_range(struct btrfs_inode *inode,
 		goto out;
 	}
 	if (ret < 0) {
-		if (locked_page)
+		btrfs_cleanup_ordered_extents(inode, locked_page, start, end - start + 1);
+		if (locked_page) {
+			u64 page_start = page_offset(locked_page);
+			u64 page_end = page_start + PAGE_SIZE - 1;
+
+			btrfs_page_set_error(inode->root->fs_info, locked_page,
+					     page_start, PAGE_SIZE);
+			set_page_writeback(locked_page);
+			end_page_writeback(locked_page);
+			end_extent_writepage(locked_page, ret, page_start, page_end);
 			unlock_page(locked_page);
+		}
 		goto out;
 	}

@@ -1390,9 +1400,12 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 	 * However, in case of unlock == 0, we still need to unlock the pages
 	 * (except @locked_page) to ensure all the pages are unlocked.
 	 */
-	if (!unlock && orig_start < start)
+	if (!unlock && orig_start < start) {
+		if (!locked_page)
+			mapping_set_error(inode->vfs_inode.i_mapping, ret);
 		extent_clear_unlock_delalloc(inode, orig_start, start - 1,
 					     locked_page, 0, page_ops);
+	}

 	/*
 	 * For the range (2). If we reserved an extent for our delalloc range

From patchwork Thu Jun 16 15:45:42 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naohiro Aota
X-Patchwork-Id: 12884390
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: Naohiro Aota
Subject: [PATCH 4/4] btrfs: replace unnecessary goto with direct return
Date: Fri, 17 Jun 2022 00:45:42 +0900
Message-Id: <7ccae9fc6975246cbb2be58c83d9ca6e3fcbb123.1655391633.git.naohiro.aota@wdc.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To:
References:
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

The "goto out;"s in cow_file_range() just result in a simple
"return ret;", which is not really useful. Replace them with direct
"return"s. This also makes the success path and the failure path stand
out.

Signed-off-by: Naohiro Aota
Reviewed-by: Filipe Manana
---
 fs/btrfs/inode.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index cae15924fc99..055c573e2eb3 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1253,7 +1253,7 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 			 * inline extent or a compressed extent.
 			 */
 			unlock_page(locked_page);
-			goto out;
+			return 0;
 		} else if (ret < 0) {
 			goto out_unlock;
 		}
@@ -1366,8 +1366,7 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 		if (ret)
 			goto out_unlock;
 	}
-out:
-	return ret;
+	return 0;

 out_drop_extent_cache:
 	btrfs_drop_extent_cache(inode, start, start + ram_size - 1, 0);
@@ -1425,7 +1424,7 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 					     page_ops);
 		start += cur_alloc_size;
 		if (start >= end)
-			goto out;
+			return ret;
 	}

 	/*
@@ -1437,7 +1436,7 @@ static noinline int cow_file_range(struct btrfs_inode *inode,
 	extent_clear_unlock_delalloc(inode, start, end, locked_page,
 				     clear_bits | EXTENT_CLEAR_DATA_RESV,
 				     page_ops);
-	goto out;
+	return ret;
 }

 /*