Message ID | 1459908294-11200-1-git-send-email-quwenruo@cn.fujitsu.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
On Wed, Apr 6, 2016 at 3:04 AM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote: > Current btrfs qgroup design implies a requirement that after calling > btrfs_qgroup_account_extents() there must be a commit root switch. > > Normally this is OK, as btrfs_qgroup_accounting_extents() is only called > inside btrfs_commit_transaction() just be commit_cowonly_roots(). btrfs_qgroup_accounting_extents() -> btrfs_qgroup_account_extents() just be -> just before > > However there is a exception at create_pending_snapshot(), which will > call btrfs_qgroup_account_extents() but no any commit root switch. "but no any commit root switch" -> without switching commit roots? > > In case of creating a snapshot whose parent root is itself (create a > snapshot of fs tree), it will corrupt qgroup by the following trace: > (skipped unrelated data) > ====== > btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 1 > qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 0, excl = 0 > qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 16384, excl = 16384 > btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 0 > ====== > > The problem here is in first qgroup_account_extent(), the > nr_new_roots of the extent is 1, which means its reference got > increased, and qgroup increased its rfer and excl. > > But at second qgroup_account_extent(), its reference got decreased, but > between these two qgroup_account_extent(), there is no switch roots. "there is no switch roots" -> there is no switch of commit roots > This leads to the same nr_old_roots, and this extent just got ignored by > qgroup, which means this extent is wrongly accounted. > > Fix it by call commit_cowonly_roots() after qgroup_account_extent() in > create_pending_snapshot(), with needed preparation. > > Reported-by: Mark Fasheh <mfasheh@suse.de> > Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Can we please get a test case for fstests? thanks > --- > fs/btrfs/transaction.c | 66 +++++++++++++++++++++++++++++++++++++------------- > 1 file changed, 49 insertions(+), 17 deletions(-) > > diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c > index 43885e5..a3eb8ac 100644 > --- a/fs/btrfs/transaction.c > +++ b/fs/btrfs/transaction.c > @@ -1516,6 +1516,55 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans, > goto fail; > } > > + /* > + * Account qgroups before insert the dir item > + * As such dir item insert will modify parent_root, which could be > + * src root. If we don't do it now, wrong accounting may be inherited > + * to snapshot qgroup. > + * > + * For reason locking tree_log_mutex, see btrfs_commit_transaction() > + * comment > + */ > + mutex_lock(&root->fs_info->tree_log_mutex); > + > + ret = commit_fs_roots(trans, root); > + if (ret) { > + mutex_unlock(&root->fs_info->tree_log_mutex); > + goto fail; > + } > + > + btrfs_apply_pending_changes(root->fs_info); > + > + ret = btrfs_qgroup_prepare_account_extents(trans, root->fs_info); > + if (ret < 0) { > + mutex_unlock(&root->fs_info->tree_log_mutex); > + goto fail; > + } > + ret = btrfs_qgroup_account_extents(trans, root->fs_info); > + if (ret < 0) { > + mutex_unlock(&root->fs_info->tree_log_mutex); > + goto fail; > + } > + /* > + * Now qgroup are all updated, we can inherit it to new qgroups > + */ > + ret = btrfs_qgroup_inherit(trans, fs_info, > + root->root_key.objectid, > + objectid, pending->inherit); > + if (ret < 0) { > + mutex_unlock(&root->fs_info->tree_log_mutex); > + goto fail; > + } > + /* > + * qgroup_account_extents() must be followed by a > + * switch_commit_roots(), or next qgroup_account_extents() will > + * be corrupted > + */ > + ret = commit_cowonly_roots(trans, root); > + mutex_unlock(&root->fs_info->tree_log_mutex); > + if (ret) > + goto fail; > + > ret = btrfs_insert_dir_item(trans, parent_root, > dentry->d_name.name, dentry->d_name.len, > parent_inode, &key, > @@ -1559,23 +1608,6 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans, > goto fail; > } > > - /* > - * account qgroup counters before qgroup_inherit() > - */ > - ret = btrfs_qgroup_prepare_account_extents(trans, fs_info); > - if (ret) > - goto fail; > - ret = btrfs_qgroup_account_extents(trans, fs_info); > - if (ret) > - goto fail; > - ret = btrfs_qgroup_inherit(trans, fs_info, > - root->root_key.objectid, > - objectid, pending->inherit); > - if (ret) { > - btrfs_abort_transaction(trans, root, ret); > - goto fail; > - } > - > fail: > pending->error = ret; > dir_item_existed: > -- > 2.8.0 > > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
Filipe Manana wrote on 2016/04/06 10:20 +0100: > On Wed, Apr 6, 2016 at 3:04 AM, Qu Wenruo <quwenruo@cn.fujitsu.com> wrote: >> Current btrfs qgroup design implies a requirement that after calling >> btrfs_qgroup_account_extents() there must be a commit root switch. >> >> Normally this is OK, as btrfs_qgroup_accounting_extents() is only called >> inside btrfs_commit_transaction() just be commit_cowonly_roots(). > > btrfs_qgroup_accounting_extents() -> btrfs_qgroup_account_extents() > > just be -> just before > >> >> However there is a exception at create_pending_snapshot(), which will >> call btrfs_qgroup_account_extents() but no any commit root switch. > > "but no any commit root switch" -> without switching commit roots? > >> >> In case of creating a snapshot whose parent root is itself (create a >> snapshot of fs tree), it will corrupt qgroup by the following trace: >> (skipped unrelated data) >> ====== >> btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 1 >> qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 0, excl = 0 >> qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 16384, excl = 16384 >> btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 0 >> ====== >> >> The problem here is in first qgroup_account_extent(), the >> nr_new_roots of the extent is 1, which means its reference got >> increased, and qgroup increased its rfer and excl. >> >> But at second qgroup_account_extent(), its reference got decreased, but >> between these two qgroup_account_extent(), there is no switch roots. > > "there is no switch roots" -> there is no switch of commit roots My English is really poor... :( > >> This leads to the same nr_old_roots, and this extent just got ignored by >> qgroup, which means this extent is wrongly accounted. >> >> Fix it by call commit_cowonly_roots() after qgroup_account_extent() in >> create_pending_snapshot(), with needed preparation. >> >> Reported-by: Mark Fasheh <mfasheh@suse.de> >> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> > > Can we please get a test case for fstests? Test case will follow soon. Thanks, Qu > > thanks > >> --- >> fs/btrfs/transaction.c | 66 +++++++++++++++++++++++++++++++++++++------------- >> 1 file changed, 49 insertions(+), 17 deletions(-) >> >> diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c >> index 43885e5..a3eb8ac 100644 >> --- a/fs/btrfs/transaction.c >> +++ b/fs/btrfs/transaction.c >> @@ -1516,6 +1516,55 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans, >> goto fail; >> } >> >> + /* >> + * Account qgroups before insert the dir item >> + * As such dir item insert will modify parent_root, which could be >> + * src root. If we don't do it now, wrong accounting may be inherited >> + * to snapshot qgroup. >> + * >> + * For reason locking tree_log_mutex, see btrfs_commit_transaction() >> + * comment >> + */ >> + mutex_lock(&root->fs_info->tree_log_mutex); >> + >> + ret = commit_fs_roots(trans, root); >> + if (ret) { >> + mutex_unlock(&root->fs_info->tree_log_mutex); >> + goto fail; >> + } >> + >> + btrfs_apply_pending_changes(root->fs_info); >> + >> + ret = btrfs_qgroup_prepare_account_extents(trans, root->fs_info); >> + if (ret < 0) { >> + mutex_unlock(&root->fs_info->tree_log_mutex); >> + goto fail; >> + } >> + ret = btrfs_qgroup_account_extents(trans, root->fs_info); >> + if (ret < 0) { >> + mutex_unlock(&root->fs_info->tree_log_mutex); >> + goto fail; >> + } >> + /* >> + * Now qgroup are all updated, we can inherit it to new qgroups >> + */ >> + ret = btrfs_qgroup_inherit(trans, fs_info, >> + root->root_key.objectid, >> + objectid, pending->inherit); >> + if (ret < 0) { >> + mutex_unlock(&root->fs_info->tree_log_mutex); >> + goto fail; >> + } >> + /* >> + * qgroup_account_extents() must be followed by a >> + * switch_commit_roots(), or next qgroup_account_extents() will >> + * be corrupted >> + */ >> + ret = commit_cowonly_roots(trans, root); >> + mutex_unlock(&root->fs_info->tree_log_mutex); >> + if (ret) >> + goto fail; >> + >> ret = btrfs_insert_dir_item(trans, parent_root, >> dentry->d_name.name, dentry->d_name.len, >> parent_inode, &key, >> @@ -1559,23 +1608,6 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans, >> goto fail; >> } >> >> - /* >> - * account qgroup counters before qgroup_inherit() >> - */ >> - ret = btrfs_qgroup_prepare_account_extents(trans, fs_info); >> - if (ret) >> - goto fail; >> - ret = btrfs_qgroup_account_extents(trans, fs_info); >> - if (ret) >> - goto fail; >> - ret = btrfs_qgroup_inherit(trans, fs_info, >> - root->root_key.objectid, >> - objectid, pending->inherit); >> - if (ret) { >> - btrfs_abort_transaction(trans, root, ret); >> - goto fail; >> - } >> - >> fail: >> pending->error = ret; >> dir_item_existed: >> -- >> 2.8.0 >> >> >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Qu, On Wed, Apr 06, 2016 at 10:04:54AM +0800, Qu Wenruo wrote: > Current btrfs qgroup design implies a requirement that after calling > btrfs_qgroup_account_extents() there must be a commit root switch. > > Normally this is OK, as btrfs_qgroup_accounting_extents() is only called > inside btrfs_commit_transaction() just be commit_cowonly_roots(). > > However there is a exception at create_pending_snapshot(), which will > call btrfs_qgroup_account_extents() but no any commit root switch. > > In case of creating a snapshot whose parent root is itself (create a > snapshot of fs tree), it will corrupt qgroup by the following trace: > (skipped unrelated data) > ====== > btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 1 > qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 0, excl = 0 > qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 16384, excl = 16384 > btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 0 > ====== > > The problem here is in first qgroup_account_extent(), the > nr_new_roots of the extent is 1, which means its reference got > increased, and qgroup increased its rfer and excl. > > But at second qgroup_account_extent(), its reference got decreased, but > between these two qgroup_account_extent(), there is no switch roots. > This leads to the same nr_old_roots, and this extent just got ignored by > qgroup, which means this extent is wrongly accounted. > > Fix it by call commit_cowonly_roots() after qgroup_account_extent() in > create_pending_snapshot(), with needed preparation. Thanks for the patch - I gave this all a whirl and get a locked up transaction thread. Here's the trace: [ 172.187269] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [btrfs-transacti:1293] [ 172.188299] Modules linked in: btrfs(OE) rpcsec_gss_krb5(E) auth_rpcgss(E) nfsv4(E) dns_resolver(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E) iscsi_ibft(E) iscsi_boot_sysfs(E) af_packet(E) xor(E) raid6_pq(E) ppdev(E) pcspkr(E) serio_raw(E) virtio_balloon(E) i2c_piix4(E) parport_pc(E) parport(E) processor(E) button(E) dm_mod(E) ext4(E) crc16(E) mbcache(E) jbd2(E) ata_generic(E) virtio_blk(E) virtio_net(E) ata_piix(E) ahci(E) libahci(E) cirrus(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) floppy(E) uhci_hcd(E) ehci_hcd(E) usbcore(E) usb_common(E) libata(E) virtio_pci(E) virtio_ring(E) virtio(E) sysimgblt(E) fb_sys_fops(E) ttm(E) drm(E) sg(E) scsi_mod(E) autofs4(E) [last unloaded: btrfs] [ 172.195789] CPU: 1 PID: 1293 Comm: btrfs-transacti Tainted: G OE 4.5.0-remove_qgroup+ #4 [ 172.196805] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20150524_160643-cloud127 04/01/2014 [ 172.198094] task: ffff8800bb90dbc0 ti: ffff880037204000 task.ti: ffff880037204000 [ 172.199077] RIP: 0010:[<ffffffff81561abb>] [<ffffffff81561abb>] _raw_spin_lock+0xb/0x20 [ 172.200076] RSP: 0018:ffff880037207d40 EFLAGS: 00000246 [ 172.200679] RAX: 0000000000000000 RBX: ffff88006f9aaf20 RCX: 0000000000000017 [ 172.201524] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8800af6704e0 [ 172.202314] RBP: ffff880037207da0 R08: 0000000000000006 R09: 0000000000000000 [ 172.203102] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800b6301800 [ 172.203892] R13: ffffffffffffffff R14: ffff8800af670370 R15: ffff8800b6301800 [ 172.204687] FS: 0000000000000000(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000 [ 172.205586] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 172.206229] CR2: 00007f9db0ae4090 CR3: 0000000137510000 CR4: 00000000000006e0 [ 172.207025] Stack: [ 172.207259] ffffffffa05c51f4 0000000000000000 0000000000000006 000000000110ebb0 [ 172.208316] ffff8800af6704d0 ffff8800af6704e0 ffff88006f9aafb0 ffff88006f9aaf20 [ 172.209186] ffff8800ba10e828 ffff8800ba10ebb0 ffff8800ba10e9df ffff8800b6301800 [ 172.210034] Call Trace: [ 172.210322] [<ffffffffa05c51f4>] ? btrfs_run_delayed_refs+0xc4/0x2d0 [btrfs] [ 172.211128] [<ffffffffa05d7e8d>] commit_cowonly_roots+0x20d/0x2d0 [btrfs] [ 172.211904] [<ffffffffa05da5bc>] btrfs_commit_transaction+0x4cc/0xaa0 [btrfs] [ 172.212697] [<ffffffffa05d503a>] transaction_kthread+0x18a/0x210 [btrfs] [ 172.213465] [<ffffffffa05d4eb0>] ? btrfs_cleanup_transaction+0x530/0x530 [btrfs] [ 172.214422] [<ffffffff8107f7c4>] kthread+0xc4/0xe0 [ 172.214973] [<ffffffff8107f700>] ? kthread_park+0x50/0x50 [ 172.215589] [<ffffffff8156218f>] ret_from_fork+0x3f/0x70 [ 172.216171] [<ffffffff8107f700>] ? kthread_park+0x50/0x50 [ 172.216765] Code: 55 48 89 e5 8b 07 85 c0 74 04 31 c0 5d c3 ba 01 00 00 00 f0 0f b1 17 85 c0 75 ef b0 01 5d c3 90 31 c0 ba 01 00 00 00 f0 0f b1 17 <85> c0 75 01 c3 55 89 c6 48 89 e5 e8 a5 6d b4 ff 5d c3 0f 1f 00 I made a fresh file system, mounted it, populated it with some data and made a bunch of snapshots with some of their own exclusive data. The pertinent commands from my test script are: btrfs quota enable /btrfs mkdir /btrfs/snaps echo "populate /btrfs/ with some data" cp -a /usr/share /btrfs/ btrfs qgroup create 1/0 /btrfs for i in `seq -w 0 14`; do S="/btrfs/snaps/snap$i" echo "create and populate $S" btrfs su snap -i 1/0 /btrfs/ $S; cp -a /boot $S; done; for i in `seq -w 3 11 `; do S="/btrfs/snaps/snap$i" echo "remove snapshot $S" btrfs su de $S done; This is on Linux 4.5 with my inherit fix and your patch applied. The script I pasted above ran with no problems until I added your patch to my kernel so my guess is it's not related to the btrfs_qgroup_inherit() patch. Nonetheless, here's a link to it in case you want a 2nd look: http://thread.gmane.org/gmane.comp.file-systems.btrfs/54755 Thanks, --Mark -- Mark Fasheh -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Mark, Thanks for the test and reporting. Mark Fasheh wrote on 2016/04/06 19:43 -0700: > Hi Qu, > > On Wed, Apr 06, 2016 at 10:04:54AM +0800, Qu Wenruo wrote: >> Current btrfs qgroup design implies a requirement that after calling >> btrfs_qgroup_account_extents() there must be a commit root switch. >> >> Normally this is OK, as btrfs_qgroup_accounting_extents() is only called >> inside btrfs_commit_transaction() just be commit_cowonly_roots(). >> >> However there is a exception at create_pending_snapshot(), which will >> call btrfs_qgroup_account_extents() but no any commit root switch. >> >> In case of creating a snapshot whose parent root is itself (create a >> snapshot of fs tree), it will corrupt qgroup by the following trace: >> (skipped unrelated data) >> ====== >> btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 1 >> qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 0, excl = 0 >> qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 16384, excl = 16384 >> btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 0 >> ====== >> >> The problem here is in first qgroup_account_extent(), the >> nr_new_roots of the extent is 1, which means its reference got >> increased, and qgroup increased its rfer and excl. >> >> But at second qgroup_account_extent(), its reference got decreased, but >> between these two qgroup_account_extent(), there is no switch roots. >> This leads to the same nr_old_roots, and this extent just got ignored by >> qgroup, which means this extent is wrongly accounted. >> >> Fix it by call commit_cowonly_roots() after qgroup_account_extent() in >> create_pending_snapshot(), with needed preparation. > > Thanks for the patch - I gave this all a whirl and get a locked up > transaction thread. Here's the trace: > > [ 172.187269] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [btrfs-transacti:1293] > [ 172.188299] Modules linked in: btrfs(OE) rpcsec_gss_krb5(E) auth_rpcgss(E) nfsv4(E) dns_resolver(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E) iscsi_ibft(E) iscsi_boot_sysfs(E) af_packet(E) xor(E) raid6_pq(E) ppdev(E) pcspkr(E) serio_raw(E) virtio_balloon(E) i2c_piix4(E) parport_pc(E) parport(E) processor(E) button(E) dm_mod(E) ext4(E) crc16(E) mbcache(E) jbd2(E) ata_generic(E) virtio_blk(E) virtio_net(E) ata_piix(E) ahci(E) libahci(E) cirrus(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) floppy(E) uhci_hcd(E) ehci_hcd(E) usbcore(E) usb_common(E) libata(E) virtio_pci(E) virtio_ring(E) virtio(E) sysimgblt(E) fb_sys_fops(E) ttm(E) drm(E) sg(E) scsi_mod(E) autofs4(E) [last unloaded: btrfs] > [ 172.195789] CPU: 1 PID: 1293 Comm: btrfs-transacti Tainted: G OE 4.5.0-remove_qgroup+ #4 > [ 172.196805] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20150524_160643-cloud127 04/01/2014 > [ 172.198094] task: ffff8800bb90dbc0 ti: ffff880037204000 task.ti: ffff880037204000 > [ 172.199077] RIP: 0010:[<ffffffff81561abb>] [<ffffffff81561abb>] _raw_spin_lock+0xb/0x20 > [ 172.200076] RSP: 0018:ffff880037207d40 EFLAGS: 00000246 > [ 172.200679] RAX: 0000000000000000 RBX: ffff88006f9aaf20 RCX: 0000000000000017 > [ 172.201524] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8800af6704e0 > [ 172.202314] RBP: ffff880037207da0 R08: 0000000000000006 R09: 0000000000000000 > [ 172.203102] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800b6301800 > [ 172.203892] R13: ffffffffffffffff R14: ffff8800af670370 R15: ffff8800b6301800 > [ 172.204687] FS: 0000000000000000(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000 > [ 172.205586] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [ 172.206229] CR2: 00007f9db0ae4090 CR3: 0000000137510000 CR4: 00000000000006e0 > [ 172.207025] Stack: > [ 172.207259] ffffffffa05c51f4 0000000000000000 0000000000000006 000000000110ebb0 > [ 172.208316] ffff8800af6704d0 ffff8800af6704e0 ffff88006f9aafb0 ffff88006f9aaf20 > [ 172.209186] ffff8800ba10e828 ffff8800ba10ebb0 ffff8800ba10e9df ffff8800b6301800 > [ 172.210034] Call Trace: > [ 172.210322] [<ffffffffa05c51f4>] ? btrfs_run_delayed_refs+0xc4/0x2d0 [btrfs] > [ 172.211128] [<ffffffffa05d7e8d>] commit_cowonly_roots+0x20d/0x2d0 [btrfs] > [ 172.211904] [<ffffffffa05da5bc>] btrfs_commit_transaction+0x4cc/0xaa0 [btrfs] > [ 172.212697] [<ffffffffa05d503a>] transaction_kthread+0x18a/0x210 [btrfs] > [ 172.213465] [<ffffffffa05d4eb0>] ? btrfs_cleanup_transaction+0x530/0x530 [btrfs] > [ 172.214422] [<ffffffff8107f7c4>] kthread+0xc4/0xe0 > [ 172.214973] [<ffffffff8107f700>] ? kthread_park+0x50/0x50 > [ 172.215589] [<ffffffff8156218f>] ret_from_fork+0x3f/0x70 > [ 172.216171] [<ffffffff8107f700>] ? kthread_park+0x50/0x50 > [ 172.216765] Code: 55 48 89 e5 8b 07 85 c0 74 04 31 c0 5d c3 ba 01 00 00 00 f0 0f b1 17 85 c0 75 ef b0 01 5d c3 90 31 c0 ba 01 00 00 00 f0 0f b1 17 <85> c0 75 01 c3 55 89 c6 48 89 e5 e8 a5 6d b4 ff 5d c3 0f 1f 00 > I also got one locked up with similar test script. However I can't reproduce it in a stable rate. My test script is much the same with yours, but more controlled contents to populate the fs: ------ #!/bin/bash dev=/dev/sdb5 mnt=/mnt/test populate_mnt() { prefix=$1 size=$2 nr=$3 path=$4 local i for ((i = 0; i < $nr; i++)) do filename=$(printf "$prefix_%08d" $i) xfs_io -f -c "pwrite 0 $size" $path/$filename &> /dev/null done } do_one_test() { umount $dev &> /dev/null mkfs.btrfs -f $dev mount $dev $mnt btrfs quota enable $mnt mkdir $mnt/snapshots echo "populating base fs" populate_mnt inline1 2k 32 $mnt populate_mnt normal 1m 16 $mnt populate_mnt inline2 2k 32 $mnt btrfs qgroup create 1/0 $mnt for i in $(seq -w 0 25); do echo "populating snapshots $i" btrfs subvolume snapshot -i 1/0 $mnt $mnt/snapshots/snap_$i populate_mnt excl$i 1m 8 $mnt/snapshots/snap_$i sync done btrfs qgroup show -prce $mnt } for x in $(seq -w 0 25); do echo "loop $x" do_one_test done ------ I ran into one soft lockup with my patch only. So I assume it's not caused by your inherit patch though. But I didn't reproduce it once more. Not sure why. What's the reproduce rate in your environment? Thanks, Qu > > I made a fresh file system, mounted it, populated it with some data and made > a bunch of snapshots with some of their own exclusive data. The pertinent > commands from my test script are: > > btrfs quota enable /btrfs > mkdir /btrfs/snaps > echo "populate /btrfs/ with some data" > cp -a /usr/share /btrfs/ > btrfs qgroup create 1/0 /btrfs > for i in `seq -w 0 14`; do > S="/btrfs/snaps/snap$i" > echo "create and populate $S" > btrfs su snap -i 1/0 /btrfs/ $S; > cp -a /boot $S; > done; > for i in `seq -w 3 11 `; do > S="/btrfs/snaps/snap$i" > echo "remove snapshot $S" > btrfs su de $S > done; > > > This is on Linux 4.5 with my inherit fix and your patch applied. The script > I pasted above ran with no problems until I added your patch to my kernel so > my guess is it's not related to the btrfs_qgroup_inherit() patch. > Nonetheless, here's a link to it in case you want a 2nd look: > > http://thread.gmane.org/gmane.comp.file-systems.btrfs/54755 > > Thanks, > --Mark > > -- > Mark Fasheh > > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Apr 07, 2016 at 04:21:53PM +0800, Qu Wenruo wrote: > I ran into one soft lockup with my patch only. So I assume it's not > caused by your inherit patch though. > But I didn't reproduce it once more. Not sure why. > > What's the reproduce rate in your environment? It happens every time for me. Just wait about 30 seconds or so (my guess is to let a transaction commit kick in). Also I can force the issue to show up if I unmount. --Mark -- Mark Fasheh -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
====== btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 1 qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 0, excl = 0 qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 16384, excl = 16384 btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 0 ====== The problem here is in first qgroup_account_extent(), the nr_new_roots of the extent is 1, which means its reference got increased, and qgroup increased its rfer and excl. But at second qgroup_account_extent(), its reference got decreased, but between these two qgroup_account_extent(), there is no switch roots. This leads to the same nr_old_roots, and this extent just got ignored by qgroup, which means this extent is wrongly accounted. Fix it by call commit_cowonly_roots() after qgroup_account_extent() in create_pending_snapshot(), with needed preparation. Reported-by: Mark Fasheh <mfasheh@suse.de> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> --- fs/btrfs/transaction.c | 66 +++++++++++++++++++++++++++++++++++++------------- 1 file changed, 49 insertions(+), 17 deletions(-) diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index 43885e5..a3eb8ac 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -1516,6 +1516,55 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans, goto fail; } + /* + * Account qgroups before insert the dir item + * As such dir item insert will modify parent_root, which could be + * src root. If we don't do it now, wrong accounting may be inherited + * to snapshot qgroup. + * + * For reason locking tree_log_mutex, see btrfs_commit_transaction() + * comment + */ + mutex_lock(&root->fs_info->tree_log_mutex); + + ret = commit_fs_roots(trans, root); + if (ret) { + mutex_unlock(&root->fs_info->tree_log_mutex); + goto fail; + } + + btrfs_apply_pending_changes(root->fs_info); + + ret = btrfs_qgroup_prepare_account_extents(trans, root->fs_info); + if (ret < 0) { + mutex_unlock(&root->fs_info->tree_log_mutex); + goto fail; + } + ret = btrfs_qgroup_account_extents(trans, root->fs_info); + if (ret < 0) { + mutex_unlock(&root->fs_info->tree_log_mutex); + goto fail; + } + /* + * Now qgroup are all updated, we can inherit it to new qgroups + */ + ret = btrfs_qgroup_inherit(trans, fs_info, + root->root_key.objectid, + objectid, pending->inherit); + if (ret < 0) { + mutex_unlock(&root->fs_info->tree_log_mutex); + goto fail; + } + /* + * qgroup_account_extents() must be followed by a + * switch_commit_roots(), or next qgroup_account_extents() will + * be corrupted + */ + ret = commit_cowonly_roots(trans, root); + mutex_unlock(&root->fs_info->tree_log_mutex); + if (ret) + goto fail; + ret = btrfs_insert_dir_item(trans, parent_root, dentry->d_name.name, dentry->d_name.len, parent_inode, &key, @@ -1559,23 +1608,6 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans, goto fail; } - /* - * account qgroup counters before qgroup_inherit() - */ - ret = btrfs_qgroup_prepare_account_extents(trans, fs_info); - if (ret) - goto fail; - ret = btrfs_qgroup_account_extents(trans, fs_info); - if (ret) - goto fail; - ret = btrfs_qgroup_inherit(trans, fs_info, - root->root_key.objectid, - objectid, pending->inherit); - if (ret) { - btrfs_abort_transaction(trans, root, ret); - goto fail; - } - fail: pending->error = ret; dir_item_existed: