From patchwork Sun Oct 2 13:24:10 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chandan Rajendra X-Patchwork-Id: 9359629 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id DE20A600C8 for ; Sun, 2 Oct 2016 13:25:39 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CD90428B16 for ; Sun, 2 Oct 2016 13:25:39 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C123128B20; Sun, 2 Oct 2016 13:25:39 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 15C1728AC5 for ; Sun, 2 Oct 2016 13:25:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751347AbcJBNZZ (ORCPT ); Sun, 2 Oct 2016 09:25:25 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:50926 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750912AbcJBNZU (ORCPT ); Sun, 2 Oct 2016 09:25:20 -0400 Received: from pps.filterd (m0098396.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.17/8.16.0.17) with SMTP id u92DNCkJ046144 for ; Sun, 2 Oct 2016 09:25:14 -0400 Received: from e36.co.us.ibm.com (e36.co.us.ibm.com [32.97.110.154]) by mx0a-001b2d01.pphosted.com with ESMTP id 25t7f9q56a-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Sun, 02 Oct 2016 09:25:13 -0400 Received: from localhost by e36.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Sun, 2 Oct 2016 07:25:13 -0600 Received: from d03dlp02.boulder.ibm.com (9.17.202.178) by e36.co.us.ibm.com (192.168.1.136) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Sun, 2 Oct 2016 07:25:10 -0600 Received: from b03cxnp07028.gho.boulder.ibm.com (b03cxnp07028.gho.boulder.ibm.com [9.17.130.15]) by d03dlp02.boulder.ibm.com (Postfix) with ESMTP id 6E6CE3E4003E; Sun, 2 Oct 2016 07:25:10 -0600 (MDT) Received: from b03ledav003.gho.boulder.ibm.com (b03ledav003.gho.boulder.ibm.com [9.17.130.234]) by b03cxnp07028.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id u92DPA3R11928028; Sun, 2 Oct 2016 06:25:10 -0700 Received: from b03ledav003.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 50D6D6A03C; Sun, 2 Oct 2016 07:25:10 -0600 (MDT) Received: from localhost.in.ibm.com (unknown [9.79.217.200]) by b03ledav003.gho.boulder.ibm.com (Postfix) with ESMTP id EABFB6A03B; Sun, 2 Oct 2016 07:25:07 -0600 (MDT) From: Chandan Rajendra To: clm@fb.com, jbacik@fb.com, dsterba@suse.com Cc: Chandan Rajendra , linux-btrfs@vger.kernel.org Subject: [PATCH V21 01/19] Btrfs: subpage-blocksize: extent_clear_unlock_delalloc: Prevent page from being unlocked more than once Date: Sun, 2 Oct 2016 18:54:10 +0530 X-Mailer: git-send-email 2.5.5 In-Reply-To: <1475414668-25954-1-git-send-email-chandan@linux.vnet.ibm.com> References: <1475414668-25954-1-git-send-email-chandan@linux.vnet.ibm.com> X-TM-AS-GCONF: 00 X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16100213-0020-0000-0000-000009ED4B84 X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00005841; HX=3.00000240; KW=3.00000007; PH=3.00000004; SC=3.00000186; SDB=6.00763599; UDB=6.00364348; IPR=6.00539010; BA=6.00004776; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00012850; XFM=3.00000011; UTC=2016-10-02 13:25:12 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 16100213-0021-0000-0000-0000561AC7E5 Message-Id: <1475414668-25954-2-git-send-email-chandan@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2016-10-02_05:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=2 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1609280000 definitions=main-1610020247 Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP extent_clear_unlock_delalloc() can unlock a page more than once as shown below (assume 4k as the block size and 64k as the page size). cow_file_range create 4k ordered extent corresponding to page offsets 0 - 4095 extent_clear_unlock_delalloc corresponding to page offsets 0 - 4095 unlock page create 4k ordered extent corresponding to page offsets 4096 - 8191 extent_clear_unlock_delalloc corresponding to page offsets 4096 - 8191 unlock page To prevent such a scenario this commit passes "delalloc end" to extent_clear_unlock_delalloc() to help decide whether the page can be unlocked or not. NOTE: Since extent_clear_unlock_delalloc() is used by compression code as well, the commit passes ordered extent "end" as the value for the argument corresponding to "delalloc end" for invocations made from compression code path. This will be fixed by a future commit that gets compression to work in subpage-blocksize scenario. Signed-off-by: Chandan Rajendra --- fs/btrfs/extent_io.c | 16 +++++++---- fs/btrfs/extent_io.h | 5 ++-- fs/btrfs/inode.c | 78 +++++++++++++++++++++++++++++----------------------- 3 files changed, 57 insertions(+), 42 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index f669240..dc60c604 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -1708,9 +1708,8 @@ out_failed: } void extent_clear_unlock_delalloc(struct inode *inode, u64 start, u64 end, - struct page *locked_page, - unsigned clear_bits, - unsigned long page_ops) + u64 delalloc_end, struct page *locked_page, + unsigned clear_bits, unsigned long page_ops) { struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree; int ret; @@ -1718,6 +1717,7 @@ void extent_clear_unlock_delalloc(struct inode *inode, u64 start, u64 end, unsigned long index = start >> PAGE_SHIFT; unsigned long end_index = end >> PAGE_SHIFT; unsigned long nr_pages = end_index - index + 1; + u64 page_end; int i; clear_extent_bit(tree, start, end, clear_bits, 1, 0, NULL, GFP_NOFS); @@ -1748,8 +1748,14 @@ void extent_clear_unlock_delalloc(struct inode *inode, u64 start, u64 end, SetPageError(pages[i]); if (page_ops & PAGE_END_WRITEBACK) end_page_writeback(pages[i]); - if (page_ops & PAGE_UNLOCK) - unlock_page(pages[i]); + + if (page_ops & PAGE_UNLOCK) { + page_end = page_offset(pages[i]) + + PAGE_SIZE - 1; + if ((page_end <= end) + || (end == delalloc_end)) + unlock_page(pages[i]); + } put_page(pages[i]); } nr_pages -= ret; diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h index 06b6f14..0948bca 100644 --- a/fs/btrfs/extent_io.h +++ b/fs/btrfs/extent_io.h @@ -430,9 +430,8 @@ int map_private_extent_buffer(struct extent_buffer *eb, unsigned long offset, void extent_range_clear_dirty_for_io(struct inode *inode, u64 start, u64 end); void extent_range_redirty_for_io(struct inode *inode, u64 start, u64 end); void extent_clear_unlock_delalloc(struct inode *inode, u64 start, u64 end, - struct page *locked_page, - unsigned bits_to_clear, - unsigned long page_ops); + u64 delalloc_end, struct page *locked_page, + unsigned bits_to_clear, unsigned long page_ops); struct bio * btrfs_bio_alloc(struct block_device *bdev, u64 first_sector, int nr_vecs, gfp_t gfp_flags); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 3440b52..3e4feac 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -560,12 +560,13 @@ cont: * we don't need to create any more async work items. * Unlock and free up our temp pages. */ - extent_clear_unlock_delalloc(inode, start, end, NULL, - clear_flags, PAGE_UNLOCK | - PAGE_CLEAR_DIRTY | - PAGE_SET_WRITEBACK | - page_error_op | - PAGE_END_WRITEBACK); + extent_clear_unlock_delalloc(inode, start, end, end, + NULL, clear_flags, + PAGE_UNLOCK + | PAGE_CLEAR_DIRTY + | PAGE_SET_WRITEBACK + | page_error_op + | PAGE_END_WRITEBACK); goto free_pages_out; } } @@ -835,6 +836,8 @@ retry: extent_clear_unlock_delalloc(inode, async_extent->start, async_extent->start + async_extent->ram_size - 1, + async_extent->start + + async_extent->ram_size - 1, NULL, EXTENT_LOCKED | EXTENT_DELALLOC, PAGE_UNLOCK | PAGE_CLEAR_DIRTY | PAGE_SET_WRITEBACK); @@ -854,9 +857,10 @@ retry: tree->ops->writepage_end_io_hook(p, start, end, NULL, 0); p->mapping = NULL; - extent_clear_unlock_delalloc(inode, start, end, NULL, 0, - PAGE_END_WRITEBACK | - PAGE_SET_ERROR); + extent_clear_unlock_delalloc(inode, start, end, end, + NULL, 0, + PAGE_END_WRITEBACK | + PAGE_SET_ERROR); free_async_extent_pages(async_extent); } alloc_hint = ins.objectid + ins.offset; @@ -871,6 +875,8 @@ out_free: extent_clear_unlock_delalloc(inode, async_extent->start, async_extent->start + async_extent->ram_size - 1, + async_extent->start + + async_extent->ram_size - 1, NULL, EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DEFRAG | EXTENT_DO_ACCOUNTING, PAGE_UNLOCK | PAGE_CLEAR_DIRTY | @@ -942,6 +948,7 @@ static noinline int cow_file_range(struct inode *inode, struct btrfs_key ins; struct extent_map *em; struct extent_map_tree *em_tree = &BTRFS_I(inode)->extent_tree; + unsigned long page_ops, extent_ops; int ret = 0; if (btrfs_is_free_space_inode(inode)) { @@ -964,7 +971,8 @@ static noinline int cow_file_range(struct inode *inode, ret = cow_file_range_inline(root, inode, start, end, 0, 0, NULL); if (ret == 0) { - extent_clear_unlock_delalloc(inode, start, end, NULL, + extent_clear_unlock_delalloc(inode, start, end, + delalloc_end, NULL, EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DEFRAG, PAGE_UNLOCK | PAGE_CLEAR_DIRTY | PAGE_SET_WRITEBACK | @@ -986,8 +994,6 @@ static noinline int cow_file_range(struct inode *inode, btrfs_drop_extent_cache(inode, start, start + num_bytes - 1, 0); while (disk_num_bytes > 0) { - unsigned long op; - cur_alloc_size = disk_num_bytes; ret = btrfs_reserve_extent(root, cur_alloc_size, root->sectorsize, 0, alloc_hint, @@ -1055,13 +1061,12 @@ static noinline int cow_file_range(struct inode *inode, * Do set the Private2 bit so we know this page was properly * setup for writepage */ - op = unlock ? PAGE_UNLOCK : 0; - op |= PAGE_SET_PRIVATE2; - - extent_clear_unlock_delalloc(inode, start, - start + ram_size - 1, locked_page, - EXTENT_LOCKED | EXTENT_DELALLOC, - op); + page_ops = unlock ? PAGE_UNLOCK : 0; + page_ops |= PAGE_SET_PRIVATE2; + extent_ops = EXTENT_LOCKED | EXTENT_DELALLOC; + extent_clear_unlock_delalloc(inode, start, start + ram_size - 1, + delalloc_end, locked_page, extent_ops, + page_ops); disk_num_bytes -= cur_alloc_size; num_bytes -= cur_alloc_size; alloc_hint = ins.objectid + ins.offset; @@ -1076,11 +1081,14 @@ out_reserve: btrfs_dec_block_group_reservations(root->fs_info, ins.objectid); btrfs_free_reserved_extent(root, ins.objectid, ins.offset, 1); out_unlock: - extent_clear_unlock_delalloc(inode, start, end, locked_page, - EXTENT_LOCKED | EXTENT_DO_ACCOUNTING | - EXTENT_DELALLOC | EXTENT_DEFRAG, - PAGE_UNLOCK | PAGE_CLEAR_DIRTY | - PAGE_SET_WRITEBACK | PAGE_END_WRITEBACK); + page_ops = unlock ? PAGE_UNLOCK : 0; + page_ops |= PAGE_CLEAR_DIRTY | PAGE_SET_WRITEBACK | PAGE_END_WRITEBACK + | PAGE_SET_ERROR; + extent_ops = EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DO_ACCOUNTING + | EXTENT_DEFRAG; + + extent_clear_unlock_delalloc(inode, start, end, delalloc_end, + locked_page, extent_ops, page_ops); goto out; } @@ -1227,9 +1235,9 @@ static noinline int csum_exist_in_range(struct btrfs_root *root, * blocks on disk */ static noinline int run_delalloc_nocow(struct inode *inode, - struct page *locked_page, - u64 start, u64 end, int *page_started, int force, - unsigned long *nr_written) + struct page *locked_page, + u64 start, u64 end, int *page_started, + int force, unsigned long *nr_written) { struct btrfs_root *root = BTRFS_I(inode)->root; struct btrfs_trans_handle *trans; @@ -1255,7 +1263,8 @@ static noinline int run_delalloc_nocow(struct inode *inode, path = btrfs_alloc_path(); if (!path) { - extent_clear_unlock_delalloc(inode, start, end, locked_page, + extent_clear_unlock_delalloc(inode, start, end, end, + locked_page, EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG, PAGE_UNLOCK | @@ -1273,7 +1282,8 @@ static noinline int run_delalloc_nocow(struct inode *inode, trans = btrfs_join_transaction(root); if (IS_ERR(trans)) { - extent_clear_unlock_delalloc(inode, start, end, locked_page, + extent_clear_unlock_delalloc(inode, start, end, end, + locked_page, EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG, PAGE_UNLOCK | @@ -1487,10 +1497,10 @@ out_check: } extent_clear_unlock_delalloc(inode, cur_offset, - cur_offset + num_bytes - 1, - locked_page, EXTENT_LOCKED | - EXTENT_DELALLOC, PAGE_UNLOCK | - PAGE_SET_PRIVATE2); + cur_offset + num_bytes - 1, end, + locked_page, EXTENT_LOCKED | + EXTENT_DELALLOC, PAGE_UNLOCK | + PAGE_SET_PRIVATE2); if (!nolock && nocow) btrfs_end_write_no_snapshoting(root); cur_offset = extent_end; @@ -1517,7 +1527,7 @@ error: ret = err; if (ret && cur_offset < end) - extent_clear_unlock_delalloc(inode, cur_offset, end, + extent_clear_unlock_delalloc(inode, cur_offset, end, end, locked_page, EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DEFRAG | EXTENT_DO_ACCOUNTING, PAGE_UNLOCK |