Message ID | 20170327172957.2518-1-rgoldwyn@suse.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Mar 27, 2017 at 12:29:57PM -0500, Goldwyn Rodrigues wrote: > From: Goldwyn Rodrigues <rgoldwyn@suse.com> > > We are facing the same problem with EDQUOT which was experienced with > ENOSPC. Not sure if we require a full ticketing system such as ENOSPC, but > here is a quick fix, which may be too big a hammer. > > Quotas are reserved during the start of an operation, incrementing > qg->reserved. However, it is written to disk in a commit_transaction > which could take as long as commit_interval. In the meantime there > could be deletions which are not accounted for because deletions are > accounted for only while committed (free_refroot). So, when we get > a EDQUOT flush the data to disk and try again. > > This fixes fstests btrfs/139. > > Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> > --- > Changes since v1: > - Changed start_delalloc_roots() to start_delalloc_inode() to target > the root in question only to reduce the amount of flush to be done. > - Added wait_ordered_extents(). > > Changes since v2: > - Revised patch header > - removed comment on combining conditions > - removed test case, to be done in fstests > > Changes sinve v3: > - testcase reinstated > - return value checks > > Changes since v4: > - removed testcase since btrfs/139 got incorporated in fstests The point was to keep the test inside the changelog as well. > - return statements corrected -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/27/2017 12:36 PM, David Sterba wrote: > On Mon, Mar 27, 2017 at 12:29:57PM -0500, Goldwyn Rodrigues wrote: >> From: Goldwyn Rodrigues <rgoldwyn@suse.com> >> >> We are facing the same problem with EDQUOT which was experienced with >> ENOSPC. Not sure if we require a full ticketing system such as ENOSPC, but >> here is a quick fix, which may be too big a hammer. >> >> Quotas are reserved during the start of an operation, incrementing >> qg->reserved. However, it is written to disk in a commit_transaction >> which could take as long as commit_interval. In the meantime there >> could be deletions which are not accounted for because deletions are >> accounted for only while committed (free_refroot). So, when we get >> a EDQUOT flush the data to disk and try again. >> >> This fixes fstests btrfs/139. >> >> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> >> --- >> Changes since v1: >> - Changed start_delalloc_roots() to start_delalloc_inode() to target >> the root in question only to reduce the amount of flush to be done. >> - Added wait_ordered_extents(). >> >> Changes since v2: >> - Revised patch header >> - removed comment on combining conditions >> - removed test case, to be done in fstests >> >> Changes sinve v3: >> - testcase reinstated >> - return value checks >> >> Changes since v4: >> - removed testcase since btrfs/139 got incorporated in fstests > > The point was to keep the test inside the changelog as well. Yes, that was before we had btrfs/139 in fstest. I have put it in the patch header. However, if you really want the test script back, I can put it there. Let me know.
On Mon, Mar 27, 2017 at 01:13:46PM -0500, Goldwyn Rodrigues wrote: > > > On 03/27/2017 12:36 PM, David Sterba wrote: > > On Mon, Mar 27, 2017 at 12:29:57PM -0500, Goldwyn Rodrigues wrote: > >> From: Goldwyn Rodrigues <rgoldwyn@suse.com> > >> > >> We are facing the same problem with EDQUOT which was experienced with > >> ENOSPC. Not sure if we require a full ticketing system such as ENOSPC, but > >> here is a quick fix, which may be too big a hammer. > >> > >> Quotas are reserved during the start of an operation, incrementing > >> qg->reserved. However, it is written to disk in a commit_transaction > >> which could take as long as commit_interval. In the meantime there > >> could be deletions which are not accounted for because deletions are > >> accounted for only while committed (free_refroot). So, when we get > >> a EDQUOT flush the data to disk and try again. > >> > >> This fixes fstests btrfs/139. > >> > >> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> > >> --- > >> Changes since v1: > >> - Changed start_delalloc_roots() to start_delalloc_inode() to target > >> the root in question only to reduce the amount of flush to be done. > >> - Added wait_ordered_extents(). > >> > >> Changes since v2: > >> - Revised patch header > >> - removed comment on combining conditions > >> - removed test case, to be done in fstests > >> > >> Changes sinve v3: > >> - testcase reinstated > >> - return value checks > >> > >> Changes since v4: > >> - removed testcase since btrfs/139 got incorporated in fstests > > > > The point was to keep the test inside the changelog as well. > > > Yes, that was before we had btrfs/139 in fstest. I have put it in the > patch header. However, if you really want the test script back, I can > put it there. Let me know. I think the test steps are worth keeping in the changelog, even if there's a fstest. I'll copy it from v4, patch added to 4.12 queue. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
At 03/28/2017 02:13 AM, Goldwyn Rodrigues wrote: > > > On 03/27/2017 12:36 PM, David Sterba wrote: >> On Mon, Mar 27, 2017 at 12:29:57PM -0500, Goldwyn Rodrigues wrote: >>> From: Goldwyn Rodrigues <rgoldwyn@suse.com> >>> >>> We are facing the same problem with EDQUOT which was experienced with >>> ENOSPC. Not sure if we require a full ticketing system such as ENOSPC, but >>> here is a quick fix, which may be too big a hammer. >>> >>> Quotas are reserved during the start of an operation, incrementing >>> qg->reserved. However, it is written to disk in a commit_transaction >>> which could take as long as commit_interval. In the meantime there >>> could be deletions which are not accounted for because deletions are >>> accounted for only while committed (free_refroot). So, when we get >>> a EDQUOT flush the data to disk and try again. >>> >>> This fixes fstests btrfs/139. This patch is causing hang for inode_cache mount option. Which can be easily triggered by btrfs/042 with inode_cache. The callback trace will be: Call Trace: __schedule+0x374/0xaf0 schedule+0x3d/0x90 wait_for_commit+0x4a/0x80 [btrfs] ? wake_atomic_t_function+0x60/0x60 btrfs_commit_transaction+0xe0/0xa10 [btrfs] ? start_transaction+0xad/0x510 [btrfs] qgroup_reserve+0x1f0/0x350 [btrfs] btrfs_qgroup_reserve_data+0xf8/0x2f0 [btrfs] ? _raw_spin_unlock+0x27/0x40 btrfs_check_data_free_space+0x6d/0xb0 [btrfs] btrfs_delalloc_reserve_space+0x25/0x70 [btrfs] btrfs_save_ino_cache+0x402/0x650 [btrfs] commit_fs_roots+0xb7/0x170 [btrfs] btrfs_commit_transaction+0x425/0xa10 [btrfs] qgroup_reserve+0x1f0/0x350 [btrfs] btrfs_qgroup_reserve_data+0xf8/0x2f0 [btrfs] ? _raw_spin_unlock+0x27/0x40 btrfs_check_data_free_space+0x6d/0xb0 [btrfs] btrfs_delalloc_reserve_space+0x25/0x70 [btrfs] btrfs_direct_IO+0x1c5/0x3b0 [btrfs] generic_file_direct_write+0xab/0x150 btrfs_file_write_iter+0x243/0x530 [btrfs] __vfs_write+0xc9/0x120 vfs_write+0xcb/0x1f0 SyS_pwrite64+0x79/0x90 entry_SYSCALL_64_fastpath+0x18/0xad We're calling btrfs_commit_transaction() inside btrfs_commit_transaction(). Which will definitely hang the system. Any idea to fix it? Thanks, Qu >>> >>> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> >>> --- >>> Changes since v1: >>> - Changed start_delalloc_roots() to start_delalloc_inode() to target >>> the root in question only to reduce the amount of flush to be done. >>> - Added wait_ordered_extents(). >>> >>> Changes since v2: >>> - Revised patch header >>> - removed comment on combining conditions >>> - removed test case, to be done in fstests >>> >>> Changes sinve v3: >>> - testcase reinstated >>> - return value checks >>> >>> Changes since v4: >>> - removed testcase since btrfs/139 got incorporated in fstests >> >> The point was to keep the test inside the changelog as well. > > > Yes, that was before we had btrfs/139 in fstest. I have put it in the > patch header. However, if you really want the test script back, I can > put it there. Let me know. > > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c index a5da750..341c594 100644 --- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -2367,6 +2367,7 @@ static int qgroup_reserve(struct btrfs_root *root, u64 num_bytes, bool enforce) struct btrfs_fs_info *fs_info = root->fs_info; u64 ref_root = root->root_key.objectid; int ret = 0; + int retried = 0; struct ulist_node *unode; struct ulist_iterator uiter; @@ -2375,7 +2376,7 @@ static int qgroup_reserve(struct btrfs_root *root, u64 num_bytes, bool enforce) if (num_bytes == 0) return 0; - +retry: spin_lock(&fs_info->qgroup_lock); quota_root = fs_info->quota_root; if (!quota_root) @@ -2402,6 +2403,27 @@ static int qgroup_reserve(struct btrfs_root *root, u64 num_bytes, bool enforce) qg = unode_aux_to_qgroup(unode); if (enforce && !qgroup_check_limits(qg, num_bytes)) { + /* + * Commit the tree and retry, since we may have + * deletions which would free up space. + */ + if (!retried && qg->reserved > 0) { + struct btrfs_trans_handle *trans; + spin_unlock(&fs_info->qgroup_lock); + ret = btrfs_start_delalloc_inodes(root, 0); + if (ret) + return ret; + btrfs_wait_ordered_extents(root, -1, 0, + (u64)-1); + trans = btrfs_join_transaction(root); + if (IS_ERR(trans)) + return PTR_ERR(trans); + ret = btrfs_commit_transaction(trans); + if (ret) + return ret; + retried++; + goto retry; + } ret = -EDQUOT; goto out; }