Message ID | 501A04CE.9090408@cn.fujitsu.com (mailing list archive) |
---|---|
State | New, archived |
Hi,

apologies for the late reply,

On Thu, Aug 02, 2012 at 12:40:46PM +0800, Miao Xie wrote:
> Changelog v1 -> v2:
> - add comment to explain why we need deal with the delayed items after
>   snapshot creation and why this operation do not corrupt the metadata.

I'm sorry, the comment did not fix the bug :)

The subvol stress is able to hit this:

[ 2360.444321] ------------[ cut here ]------------
[ 2360.448019] kernel BUG at fs/btrfs/extent-tree.c:6047!
[ 2360.448019] invalid opcode: 0000 [#1] SMP
[ 2360.448019] CPU 0
[ 2360.448019] Modules linked in: btrfs aoe [last unloaded: btrfs]
[ 2360.448019]
[ 2360.448019] Pid: 8212, comm: btrfs Not tainted 3.5.0-default+ #170 Intel Corporation Santa Rosa platform/Matanzas
[ 2360.448019] RIP: 0010:[<ffffffffa00f62a1>] [<ffffffffa00f62a1>] run_clustered_refs+0xa11/0xa20 [btrfs]
[ 2360.448019] RSP: 0018:ffff88003eca1a68 EFLAGS: 00010246
[ 2360.448019] RAX: 00000000000007ff RBX: ffff880017a694c8 RCX: ffff88003eca1a08
[ 2360.448019] RDX: ffff880028aa9000 RSI: 00000000000007fe RDI: ffff880064223cf0
[ 2360.448019] RBP: ffff88003eca1b48 R08: 00000000000007ff R09: ffff88003eca19f8
[ 2360.448019] R10: ffff88002435d1e8 R11: 0000000000000000 R12: ffff880025d66d28
[ 2360.448019] R13: ffff880038640000 R14: ffff8800778dfa88 R15: ffff880060f010d0
[ 2360.448019] FS: 00007f3289f35740(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
[ 2360.448019] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 2360.448019] CR2: ffffffffff600400 CR3: 000000002e112000 CR4: 00000000000007f0
[ 2360.448019] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2360.448019] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 2360.448019] Process btrfs (pid: 8212, threadinfo ffff88003eca0000, task ffff88001d834200)
[ 2360.448019] Stack:
[ 2360.448019] 0000000000000000 0000000000000000 0000000000000001 0000000000000000
[ 2360.448019] 00000000000007ed ffff88002435d1e8 000000003eca1b18 0000000000000000
[ 2360.448019] 0000000000000770 0000000000000000 000000005cb1e000 ffff88003eca1c08
[ 2360.448019] Call Trace:
[ 2360.448019] [<ffffffffa00f6479>] btrfs_run_delayed_refs+0x1c9/0x550 [btrfs]
[ 2360.448019] [<ffffffff810a4d15>] ? trace_hardirqs_on_caller+0x155/0x1d0
[ 2360.448019] [<ffffffffa00e306a>] ? btrfs_free_path+0x2a/0x40 [btrfs]
[ 2360.448019] [<ffffffffa015c741>] ? btrfs_run_delayed_items+0xf1/0x160 [btrfs]
[ 2360.448019] [<ffffffffa0108a15>] btrfs_commit_transaction+0x605/0xb00 [btrfs]
[ 2360.448019] [<ffffffff8109e70d>] ? lock_release_holdtime+0x3d/0x1c0
[ 2360.448019] [<ffffffffa013fc88>] ? btrfs_mksubvol+0x298/0x360 [btrfs]
[ 2360.448019] [<ffffffff8106d210>] ? wake_up_bit+0x40/0x40
[ 2360.448019] [<ffffffff8137d88e>] ? do_raw_spin_unlock+0x5e/0xb0
[ 2360.448019] [<ffffffffa013fd48>] btrfs_mksubvol+0x358/0x360 [btrfs]
[ 2360.448019] [<ffffffffa013fe5a>] btrfs_ioctl_snap_create_transid+0x10a/0x190 [btrfs]
[ 2360.448019] [<ffffffffa014005d>] btrfs_ioctl_snap_create_v2.clone.0+0xfd/0x110 [btrfs]
[ 2360.448019] [<ffffffffa01419ee>] btrfs_ioctl+0x48e/0x1340 [btrfs]
[ 2360.448019] [<ffffffff818f0f00>] ? do_page_fault+0x2d0/0x580
[ 2360.448019] [<ffffffff818eca70>] ? _raw_spin_unlock_irq+0x30/0x50
[ 2360.448019] [<ffffffff81078463>] ? finish_task_switch+0x83/0xf0
[ 2360.448019] [<ffffffff81161d08>] do_vfs_ioctl+0x98/0x560
[ 2360.448019] [<ffffffff818ed215>] ? retint_swapgs+0x13/0x1b
[ 2360.448019] [<ffffffff8116221f>] sys_ioctl+0x4f/0x80
[ 2360.448019] [<ffffffff818f56e9>] system_call_fastpath+0x16/0x1b
[ 2360.448019] Code: 8b 76 40 48 89 d7 48 89 55 a0 e8 2b 74 ff ff 83 f8 17 0f 87 1e ff ff ff 0f 0b 80 fa b2 0f 84 b4 f8 ff ff 0f 0b 0f 0b 0f 0b 0f 0b <0f> 0b 0f 0b 0f 0b 0f 0b 0f 1f 80 00 00 00 00 55 48 89 e5 41 57
[ 2360.448019] RIP [<ffffffffa00f62a1>] run_clustered_refs+0xa11/0xa20 [btrfs]
[ 2360.448019] RSP <ffff88003eca1a68>
[ 2360.814508] ---[ end trace 555a16cac3620ccb ]---
[ 2360.820398] note: btrfs[8212] exited with preempt_count 1
[ 2360.827072] BUG: sleeping function called from invalid context at kernel/rwsem.c:20
[ 2360.836047] in_atomic(): 1, irqs_disabled(): 0, pid: 8212, name: btrfs
[ 2360.843859] INFO: lockdep is turned off.
[ 2360.849021] Pid: 8212, comm: btrfs Tainted: G D 3.5.0-default+ #170
[ 2360.849022] Call Trace:
[ 2360.849027] [<ffffffff8107a40c>] __might_sleep+0xfc/0x130
[ 2360.849030] [<ffffffff818ea0f6>] down_read+0x26/0xa0
[ 2360.849034] [<ffffffff810b416b>] acct_collect+0x4b/0x1b0
[ 2360.849038] [<ffffffff8104c838>] do_exit+0x718/0x9a0
[ 2360.849041] [<ffffffff81049a26>] ? kmsg_dump+0x26/0x140
[ 2360.849043] [<ffffffff818ee0c0>] oops_end+0xb0/0xf0
[ 2360.849046] [<ffffffff81005a7b>] die+0x5b/0x90
[ 2360.849048] [<ffffffff818ed9a4>] do_trap+0xc4/0x170
[ 2360.849052] [<ffffffff810030a5>] do_invalid_op+0x95/0xb0
[ 2360.849067] [<ffffffffa00f62a1>] ? run_clustered_refs+0xa11/0xa20 [btrfs]
[ 2360.849071] [<ffffffff813779dd>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[ 2360.849073] [<ffffffff818ed260>] ? restore_args+0x30/0x30
[ 2360.849076] [<ffffffff818f674b>] invalid_op+0x1b/0x20
[ 2360.849087] [<ffffffffa00f62a1>] ? run_clustered_refs+0xa11/0xa20 [btrfs]
[ 2360.849097] [<ffffffffa00f5f2b>] ? run_clustered_refs+0x69b/0xa20 [btrfs]
[ 2360.849108] [<ffffffffa00f6479>] btrfs_run_delayed_refs+0x1c9/0x550 [btrfs]
[ 2360.849110] [<ffffffff810a4d15>] ? trace_hardirqs_on_caller+0x155/0x1d0
[ 2360.849119] [<ffffffffa00e306a>] ? btrfs_free_path+0x2a/0x40 [btrfs]
[ 2360.849133] [<ffffffffa015c741>] ? btrfs_run_delayed_items+0xf1/0x160 [btrfs]
[ 2360.849145] [<ffffffffa0108a15>] btrfs_commit_transaction+0x605/0xb00 [btrfs]
[ 2360.849148] [<ffffffff8109e70d>] ? lock_release_holdtime+0x3d/0x1c0
[ 2360.849161] [<ffffffffa013fc88>] ? btrfs_mksubvol+0x298/0x360 [btrfs]
[ 2360.849164] [<ffffffff8106d210>] ? wake_up_bit+0x40/0x40
[ 2360.849166] [<ffffffff8137d88e>] ? do_raw_spin_unlock+0x5e/0xb0
[ 2360.849180] [<ffffffffa013fd48>] btrfs_mksubvol+0x358/0x360 [btrfs]
[ 2360.849194] [<ffffffffa013fe5a>] btrfs_ioctl_snap_create_transid+0x10a/0x190 [btrfs]
[ 2360.849207] [<ffffffffa014005d>] btrfs_ioctl_snap_create_v2.clone.0+0xfd/0x110 [btrfs]
[ 2360.849221] [<ffffffffa01419ee>] btrfs_ioctl+0x48e/0x1340 [btrfs]
[ 2360.849224] [<ffffffff818f0f00>] ? do_page_fault+0x2d0/0x580
[ 2360.849226] [<ffffffff818eca70>] ? _raw_spin_unlock_irq+0x30/0x50
[ 2360.849229] [<ffffffff81078463>] ? finish_task_switch+0x83/0xf0
[ 2360.849231] [<ffffffff81161d08>] do_vfs_ioctl+0x98/0x560
[ 2360.849234] [<ffffffff818ed215>] ? retint_swapgs+0x13/0x1b
[ 2360.849236] [<ffffffff8116221f>] sys_ioctl+0x4f/0x80
[ 2360.849239] [<ffffffff818f56e9>] system_call_fastpath+0x16/0x1b
[ 2360.849255] BUG: scheduling while atomic: btrfs/8212/0x10000002
[ 2360.849256] INFO: lockdep is turned off.
[ 2360.849257] Modules linked in: btrfs aoe [last unloaded: btrfs]
[ 2360.849261] Pid: 8212, comm: btrfs Tainted: G D 3.5.0-default+ #170
[ 2360.849262] Call Trace:
[ 2360.849262] [<ffffffff81078318>] __schedule_bug+0x68/0x90
[ 2360.849265] [<ffffffff818eafcc>] __schedule+0x73c/0x810
[ 2360.849268] [<ffffffff8107b48a>] __cond_resched+0x2a/0x40
[ 2360.849270] [<ffffffff818eb121>] _cond_resched+0x31/0x40
[ 2360.849273] [<ffffffff81128e13>] unmap_single_vma+0x493/0x750
[ 2360.849276] [<ffffffff811100b0>] ? lru_deactivate_fn+0x1e0/0x1e0
[ 2360.849279] [<ffffffff810a4be0>] ? trace_hardirqs_on_caller+0x20/0x1d0
[ 2360.849281] [<ffffffff8112986c>] unmap_vmas+0x3c/0x60
[ 2360.849284] [<ffffffff81130de1>] exit_mmap+0x81/0x140
[ 2360.849287] [<ffffffff81043824>] mmput+0x74/0x130
[ 2360.849289] [<ffffffff8104a520>] exit_mm+0x100/0x120
[ 2360.849292] [<ffffffff8104c858>] do_exit+0x738/0x9a0
[ 2360.849294] [<ffffffff81049a26>] ? kmsg_dump+0x26/0x140
[ 2360.849297] [<ffffffff818ee0c0>] oops_end+0xb0/0xf0
[ 2360.849299] [<ffffffff81005a7b>] die+0x5b/0x90
[ 2360.849301] [<ffffffff818ed9a4>] do_trap+0xc4/0x170
[ 2360.849304] [<ffffffff810030a5>] do_invalid_op+0x95/0xb0
[ 2360.849307] [<ffffffffa00f62a1>] ? run_clustered_refs+0xa11/0xa20 [btrfs]
[ 2360.849317] [<ffffffff813779dd>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[ 2360.849320] [<ffffffff818ed260>] ? restore_args+0x30/0x30
[ 2360.849322] [<ffffffff818f674b>] invalid_op+0x1b/0x20
[ 2360.849325] [<ffffffffa00f62a1>] ? run_clustered_refs+0xa11/0xa20 [btrfs]
[ 2360.849335] [<ffffffffa00f5f2b>] ? run_clustered_refs+0x69b/0xa20 [btrfs]
[ 2360.849346] [<ffffffffa00f6479>] btrfs_run_delayed_refs+0x1c9/0x550 [btrfs]
[ 2360.849356] [<ffffffff810a4d15>] ? trace_hardirqs_on_caller+0x155/0x1d0
[ 2360.849358] [<ffffffffa00e306a>] ? btrfs_free_path+0x2a/0x40 [btrfs]
[ 2360.849367] [<ffffffffa015c741>] ? btrfs_run_delayed_items+0xf1/0x160 [btrfs]
[ 2360.849380] [<ffffffffa0108a15>] btrfs_commit_transaction+0x605/0xb00 [btrfs]
[ 2360.849393] [<ffffffff8109e70d>] ? lock_release_holdtime+0x3d/0x1c0
[ 2360.849395] [<ffffffffa013fc88>] ? btrfs_mksubvol+0x298/0x360 [btrfs]
[ 2360.849409] [<ffffffff8106d210>] ? wake_up_bit+0x40/0x40
[ 2360.849411] [<ffffffff8137d88e>] ? do_raw_spin_unlock+0x5e/0xb0
[ 2360.849413] [<ffffffffa013fd48>] btrfs_mksubvol+0x358/0x360 [btrfs]
[ 2360.849427] [<ffffffffa013fe5a>] btrfs_ioctl_snap_create_transid+0x10a/0x190 [btrfs]
[ 2360.849441] [<ffffffffa014005d>] btrfs_ioctl_snap_create_v2.clone.0+0xfd/0x110 [btrfs]
[ 2360.849455] [<ffffffffa01419ee>] btrfs_ioctl+0x48e/0x1340 [btrfs]
[ 2360.849469] [<ffffffff818f0f00>] ? do_page_fault+0x2d0/0x580
[ 2360.849471] [<ffffffff818eca70>] ? _raw_spin_unlock_irq+0x30/0x50
[ 2360.849473] [<ffffffff81078463>] ? finish_task_switch+0x83/0xf0
[ 2360.849476] [<ffffffff81161d08>] do_vfs_ioctl+0x98/0x560
[ 2360.849478] [<ffffffff818ed215>] ? retint_swapgs+0x13/0x1b
[ 2360.849481] [<ffffffff8116221f>] sys_ioctl+0x4f/0x80
[ 2360.849483] [<ffffffff818f56e9>] system_call_fastpath+0x16/0x1b

fs/btrfs/extent-tree.c:6047

6046         if (parent > 0) {
6047                 BUG_ON(!(flags & BTRFS_BLOCK_FLAG_FULL_BACKREF));
6048                 btrfs_set_extent_inline_ref_type(leaf, iref,
6049                                                  BTRFS_SHARED_BLOCK_REF_KEY);
6050                 btrfs_set_extent_inline_ref_offset(leaf, iref, parent);
6051         } else {
6052                 btrfs_set_extent_inline_ref_type(leaf, iref,
6053                                                  BTRFS_TREE_BLOCK_REF_KEY);
6054                 btrfs_set_extent_inline_ref_offset(leaf, iref, root_objectid);
6055         }

Currently for-linux hangs early during the test, so I applied V3 patches on top
of 3.5.

The filesystem is freshly created; the load simultaneously unpacks a large tar,
snapshots the fs, deletes a random snapshot, and runs a looped rm of the
untarred dir. It crashes after some minutes, reliably.

Fsck spits out lots of errors:

ref mismatch on [1133031424 4096] extent item 1, found 0
Backref 1133031424 root 5 not referenced back 0x7d1f40
Incorrect global backref count on 1133031424 found 1 wanted 0
backpointer mismatch on [1133031424 4096]
owner ref check failed [1133031424 4096]

ref mismatch on [11213131776 16384] extent item 1, found 0
Incorrect local backref count on 11213131776 root 5 owner 34509 offset 0 found 0 wanted 1 back 0x1424d8e0
backpointer mismatch on [11213131776 16384]
owner ref check failed [11213131776 16384]

fs tree 260 refs 6 not found
unresolved ref root 263 dir 256 index 4 namelen 14 name snap2748615355 error 600
unresolved ref root 267 dir 256 index 4 namelen 14 name snap2748615355 error 600
unresolved ref root 269 dir 256 index 4 namelen 14 name snap2748615355 error 600
unresolved ref root 273 dir 256 index 4 namelen 14 name snap2748615355 error 600
unresolved ref root 274 dir 256 index 4 namelen 14 name snap2748615355 error 600
unresolved ref root 276 dir 256 index 4 namelen 14 name snap2748615355 error 600

I've asked Josef to pull those patches out of btrfs-next; feel free to send me
any testing version if you can't reproduce it on your side.

david
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
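The assertion quoted from fs/btrfs/extent-tree.c:6047 in the report above can be modeled in userspace roughly as follows. This is only a sketch of the quoted branch, not the kernel code: `struct inline_ref_model`, `model_set_inline_ref()` and `model_demo()` are hypothetical names, the sample block addresses are arbitrary, and the numeric constants are the values I believe the btrfs headers use, so check them against your tree before relying on them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match the btrfs on-disk format headers; verify before use. */
#define BTRFS_BLOCK_FLAG_FULL_BACKREF (1ULL << 8)
#define BTRFS_TREE_BLOCK_REF_KEY      176
#define BTRFS_SHARED_BLOCK_REF_KEY    182

/* Hypothetical stand-in for the inline ref stored in the extent item. */
struct inline_ref_model {
	uint8_t  type;   /* one of the *_BLOCK_REF_KEY values */
	uint64_t offset; /* parent bytenr or owning root objectid */
};

/*
 * Hypothetical userspace model of the quoted branch.
 * Returns false in the case where the kernel would hit the BUG_ON().
 */
bool model_set_inline_ref(uint64_t parent, uint64_t root_objectid,
			  uint64_t flags, struct inline_ref_model *iref)
{
	if (parent > 0) {
		/* The quoted code asserts FULL_BACKREF whenever parent is set. */
		if (!(flags & BTRFS_BLOCK_FLAG_FULL_BACKREF))
			return false;
		iref->type = BTRFS_SHARED_BLOCK_REF_KEY;
		iref->offset = parent;
	} else {
		iref->type = BTRFS_TREE_BLOCK_REF_KEY;
		iref->offset = root_objectid;
	}
	return true;
}

/* Exercises the three cases with arbitrary sample values. */
void model_demo(void)
{
	struct inline_ref_model iref = {0};

	/* parent == 0: keyed ref that records the owning root. */
	assert(model_set_inline_ref(0, 5, 0, &iref));
	assert(iref.type == BTRFS_TREE_BLOCK_REF_KEY && iref.offset == 5);

	/* parent set and FULL_BACKREF set: shared ref that records the parent. */
	assert(model_set_inline_ref(4096, 5, BTRFS_BLOCK_FLAG_FULL_BACKREF, &iref));
	assert(iref.type == BTRFS_SHARED_BLOCK_REF_KEY && iref.offset == 4096);

	/* parent set but FULL_BACKREF missing: the state that trips line 6047. */
	assert(!model_set_inline_ref(4096, 5, 0, &iref));
}
```

Calling `model_demo()` from a test harness walks through all three cases. The last one corresponds to the oops in the report: a delayed ref carrying a parent for a block whose flags lack `BTRFS_BLOCK_FLAG_FULL_BACKREF`.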
On Thu, 2 Aug 2012 13:46:31 +0200, David Sterba wrote: > Hi, > > appologies for late reply, > > On Thu, Aug 02, 2012 at 12:40:46PM +0800, Miao Xie wrote: >> Changelog v1 -> v2: >> - add comment to explain why we need deal with the delayed items after >> snapshot creation and why this operation do not corrupt the metadata. > > I'm sorry, the comment did not fix the bug :) > > The subvol stress is able to hit this: > > [ 2360.444321] ------------[ cut here ]------------ > [ 2360.448019] kernel BUG at fs/btrfs/extent-tree.c:6047! > [ 2360.448019] invalid opcode: 0000 [#1] SMP > [ 2360.448019] CPU 0 > [ 2360.448019] Modules linked in: btrfs aoe [last unloaded: btrfs] > [ 2360.448019] > [ 2360.448019] Pid: 8212, comm: btrfs Not tainted 3.5.0-default+ #170 Intel Corporation Santa Rosa platform/Matanzas > [ 2360.448019] RIP: 0010:[<ffffffffa00f62a1>] [<ffffffffa00f62a1>] run_clustered_refs+0xa11/0xa20 [btrfs] > [ 2360.448019] RSP: 0018:ffff88003eca1a68 EFLAGS: 00010246 > [ 2360.448019] RAX: 00000000000007ff RBX: ffff880017a694c8 RCX: ffff88003eca1a08 > [ 2360.448019] RDX: ffff880028aa9000 RSI: 00000000000007fe RDI: ffff880064223cf0 > [ 2360.448019] RBP: ffff88003eca1b48 R08: 00000000000007ff R09: ffff88003eca19f8 > [ 2360.448019] R10: ffff88002435d1e8 R11: 0000000000000000 R12: ffff880025d66d28 > [ 2360.448019] R13: ffff880038640000 R14: ffff8800778dfa88 R15: ffff880060f010d0 > [ 2360.448019] FS: 00007f3289f35740(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000 > [ 2360.448019] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [ 2360.448019] CR2: ffffffffff600400 CR3: 000000002e112000 CR4: 00000000000007f0 > [ 2360.448019] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 2360.448019] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > [ 2360.448019] Process btrfs (pid: 8212, threadinfo ffff88003eca0000, task ffff88001d834200) > [ 2360.448019] Stack: > [ 2360.448019] 0000000000000000 0000000000000000 0000000000000001 
0000000000000000 > [ 2360.448019] 00000000000007ed ffff88002435d1e8 000000003eca1b18 0000000000000000 > [ 2360.448019] 0000000000000770 0000000000000000 000000005cb1e000 ffff88003eca1c08 > [ 2360.448019] Call Trace: > [ 2360.448019] [<ffffffffa00f6479>] btrfs_run_delayed_refs+0x1c9/0x550 [btrfs] > [ 2360.448019] [<ffffffff810a4d15>] ? trace_hardirqs_on_caller+0x155/0x1d0 > [ 2360.448019] [<ffffffffa00e306a>] ? btrfs_free_path+0x2a/0x40 [btrfs] > [ 2360.448019] [<ffffffffa015c741>] ? btrfs_run_delayed_items+0xf1/0x160 [btrfs] > [ 2360.448019] [<ffffffffa0108a15>] btrfs_commit_transaction+0x605/0xb00 [btrfs] > [ 2360.448019] [<ffffffff8109e70d>] ? lock_release_holdtime+0x3d/0x1c0 > [ 2360.448019] [<ffffffffa013fc88>] ? btrfs_mksubvol+0x298/0x360 [btrfs] > [ 2360.448019] [<ffffffff8106d210>] ? wake_up_bit+0x40/0x40 > [ 2360.448019] [<ffffffff8137d88e>] ? do_raw_spin_unlock+0x5e/0xb0 > [ 2360.448019] [<ffffffffa013fd48>] btrfs_mksubvol+0x358/0x360 [btrfs] > [ 2360.448019] [<ffffffffa013fe5a>] btrfs_ioctl_snap_create_transid+0x10a/0x190 [btrfs] > [ 2360.448019] [<ffffffffa014005d>] btrfs_ioctl_snap_create_v2.clone.0+0xfd/0x110 [btrfs] > [ 2360.448019] [<ffffffffa01419ee>] btrfs_ioctl+0x48e/0x1340 [btrfs] > [ 2360.448019] [<ffffffff818f0f00>] ? do_page_fault+0x2d0/0x580 > [ 2360.448019] [<ffffffff818eca70>] ? _raw_spin_unlock_irq+0x30/0x50 > [ 2360.448019] [<ffffffff81078463>] ? finish_task_switch+0x83/0xf0 > [ 2360.448019] [<ffffffff81161d08>] do_vfs_ioctl+0x98/0x560 > [ 2360.448019] [<ffffffff818ed215>] ? 
retint_swapgs+0x13/0x1b > [ 2360.448019] [<ffffffff8116221f>] sys_ioctl+0x4f/0x80 > [ 2360.448019] [<ffffffff818f56e9>] system_call_fastpath+0x16/0x1b > [ 2360.448019] Code: 8b 76 40 48 89 d7 48 89 55 a0 e8 2b 74 ff ff 83 f8 17 0f 87 1e ff ff ff 0f 0b 80 fa b2 0f 84 b4 f8 ff ff 0f 0b 0f 0b 0f 0b 0f 0b <0f> 0b 0f 0b 0f 0b 0f 0b 0f 1f 80 00 00 00 00 55 48 89 e5 41 57 > [ 2360.448019] RIP [<ffffffffa00f62a1>] run_clustered_refs+0xa11/0xa20 [btrfs] > [ 2360.448019] RSP <ffff88003eca1a68> > [ 2360.814508] ---[ end trace 555a16cac3620ccb ]--- > [ 2360.820398] note: btrfs[8212] exited with preempt_count 1 > [ 2360.827072] BUG: sleeping function called from invalid context at kernel/rwsem.c:20 > [ 2360.836047] in_atomic(): 1, irqs_disabled(): 0, pid: 8212, name: btrfs > [ 2360.843859] INFO: lockdep is turned off. > [ 2360.849021] Pid: 8212, comm: btrfs Tainted: G D 3.5.0-default+ #170 > [ 2360.849022] Call Trace: > [ 2360.849027] [<ffffffff8107a40c>] __might_sleep+0xfc/0x130 > [ 2360.849030] [<ffffffff818ea0f6>] down_read+0x26/0xa0 > [ 2360.849034] [<ffffffff810b416b>] acct_collect+0x4b/0x1b0 > [ 2360.849038] [<ffffffff8104c838>] do_exit+0x718/0x9a0 > [ 2360.849041] [<ffffffff81049a26>] ? kmsg_dump+0x26/0x140 > [ 2360.849043] [<ffffffff818ee0c0>] oops_end+0xb0/0xf0 > [ 2360.849046] [<ffffffff81005a7b>] die+0x5b/0x90 > [ 2360.849048] [<ffffffff818ed9a4>] do_trap+0xc4/0x170 > [ 2360.849052] [<ffffffff810030a5>] do_invalid_op+0x95/0xb0 > [ 2360.849067] [<ffffffffa00f62a1>] ? run_clustered_refs+0xa11/0xa20 [btrfs] > [ 2360.849071] [<ffffffff813779dd>] ? trace_hardirqs_off_thunk+0x3a/0x3c > [ 2360.849073] [<ffffffff818ed260>] ? restore_args+0x30/0x30 > [ 2360.849076] [<ffffffff818f674b>] invalid_op+0x1b/0x20 > [ 2360.849087] [<ffffffffa00f62a1>] ? run_clustered_refs+0xa11/0xa20 [btrfs] > [ 2360.849097] [<ffffffffa00f5f2b>] ? 
run_clustered_refs+0x69b/0xa20 [btrfs] > [ 2360.849108] [<ffffffffa00f6479>] btrfs_run_delayed_refs+0x1c9/0x550 [btrfs] > [ 2360.849110] [<ffffffff810a4d15>] ? trace_hardirqs_on_caller+0x155/0x1d0 > [ 2360.849119] [<ffffffffa00e306a>] ? btrfs_free_path+0x2a/0x40 [btrfs] > [ 2360.849133] [<ffffffffa015c741>] ? btrfs_run_delayed_items+0xf1/0x160 [btrfs] > [ 2360.849145] [<ffffffffa0108a15>] btrfs_commit_transaction+0x605/0xb00 [btrfs] > [ 2360.849148] [<ffffffff8109e70d>] ? lock_release_holdtime+0x3d/0x1c0 > [ 2360.849161] [<ffffffffa013fc88>] ? btrfs_mksubvol+0x298/0x360 [btrfs] > [ 2360.849164] [<ffffffff8106d210>] ? wake_up_bit+0x40/0x40 > [ 2360.849166] [<ffffffff8137d88e>] ? do_raw_spin_unlock+0x5e/0xb0 > [ 2360.849180] [<ffffffffa013fd48>] btrfs_mksubvol+0x358/0x360 [btrfs] > [ 2360.849194] [<ffffffffa013fe5a>] btrfs_ioctl_snap_create_transid+0x10a/0x190 [btrfs] > [ 2360.849207] [<ffffffffa014005d>] btrfs_ioctl_snap_create_v2.clone.0+0xfd/0x110 [btrfs] > [ 2360.849221] [<ffffffffa01419ee>] btrfs_ioctl+0x48e/0x1340 [btrfs] > [ 2360.849224] [<ffffffff818f0f00>] ? do_page_fault+0x2d0/0x580 > [ 2360.849226] [<ffffffff818eca70>] ? _raw_spin_unlock_irq+0x30/0x50 > [ 2360.849229] [<ffffffff81078463>] ? finish_task_switch+0x83/0xf0 > [ 2360.849231] [<ffffffff81161d08>] do_vfs_ioctl+0x98/0x560 > [ 2360.849234] [<ffffffff818ed215>] ? retint_swapgs+0x13/0x1b > [ 2360.849236] [<ffffffff8116221f>] sys_ioctl+0x4f/0x80 > [ 2360.849239] [<ffffffff818f56e9>] system_call_fastpath+0x16/0x1b > [ 2360.849255] BUG: scheduling while atomic: btrfs/8212/0x10000002 > [ 2360.849256] INFO: lockdep is turned off. 
> [ 2360.849257] Modules linked in: btrfs aoe [last unloaded: btrfs] > [ 2360.849261] Pid: 8212, comm: btrfs Tainted: G D 3.5.0-default+ #170 > [ 2360.849262] Call Trace: > [ 2360.849262] [<ffffffff81078318>] __schedule_bug+0x68/0x90 > [ 2360.849265] [<ffffffff818eafcc>] __schedule+0x73c/0x810 > [ 2360.849268] [<ffffffff8107b48a>] __cond_resched+0x2a/0x40 > [ 2360.849270] [<ffffffff818eb121>] _cond_resched+0x31/0x40 > [ 2360.849273] [<ffffffff81128e13>] unmap_single_vma+0x493/0x750 > [ 2360.849276] [<ffffffff811100b0>] ? lru_deactivate_fn+0x1e0/0x1e0 > [ 2360.849279] [<ffffffff810a4be0>] ? trace_hardirqs_on_caller+0x20/0x1d0 > [ 2360.849281] [<ffffffff8112986c>] unmap_vmas+0x3c/0x60 > [ 2360.849284] [<ffffffff81130de1>] exit_mmap+0x81/0x140 > [ 2360.849287] [<ffffffff81043824>] mmput+0x74/0x130 > [ 2360.849289] [<ffffffff8104a520>] exit_mm+0x100/0x120 > [ 2360.849292] [<ffffffff8104c858>] do_exit+0x738/0x9a0 > [ 2360.849294] [<ffffffff81049a26>] ? kmsg_dump+0x26/0x140 > [ 2360.849297] [<ffffffff818ee0c0>] oops_end+0xb0/0xf0 > [ 2360.849299] [<ffffffff81005a7b>] die+0x5b/0x90 > [ 2360.849301] [<ffffffff818ed9a4>] do_trap+0xc4/0x170 > [ 2360.849304] [<ffffffff810030a5>] do_invalid_op+0x95/0xb0 > [ 2360.849307] [<ffffffffa00f62a1>] ? run_clustered_refs+0xa11/0xa20 [btrfs] > [ 2360.849317] [<ffffffff813779dd>] ? trace_hardirqs_off_thunk+0x3a/0x3c > [ 2360.849320] [<ffffffff818ed260>] ? restore_args+0x30/0x30 > [ 2360.849322] [<ffffffff818f674b>] invalid_op+0x1b/0x20 > [ 2360.849325] [<ffffffffa00f62a1>] ? run_clustered_refs+0xa11/0xa20 [btrfs] > [ 2360.849335] [<ffffffffa00f5f2b>] ? run_clustered_refs+0x69b/0xa20 [btrfs] > [ 2360.849346] [<ffffffffa00f6479>] btrfs_run_delayed_refs+0x1c9/0x550 [btrfs] > [ 2360.849356] [<ffffffff810a4d15>] ? trace_hardirqs_on_caller+0x155/0x1d0 > [ 2360.849358] [<ffffffffa00e306a>] ? btrfs_free_path+0x2a/0x40 [btrfs] > [ 2360.849367] [<ffffffffa015c741>] ? 
btrfs_run_delayed_items+0xf1/0x160 [btrfs]
> [ 2360.849380] [<ffffffffa0108a15>] btrfs_commit_transaction+0x605/0xb00 [btrfs]
> [ 2360.849393] [<ffffffff8109e70d>] ? lock_release_holdtime+0x3d/0x1c0
> [ 2360.849395] [<ffffffffa013fc88>] ? btrfs_mksubvol+0x298/0x360 [btrfs]
> [ 2360.849409] [<ffffffff8106d210>] ? wake_up_bit+0x40/0x40
> [ 2360.849411] [<ffffffff8137d88e>] ? do_raw_spin_unlock+0x5e/0xb0
> [ 2360.849413] [<ffffffffa013fd48>] btrfs_mksubvol+0x358/0x360 [btrfs]
> [ 2360.849427] [<ffffffffa013fe5a>] btrfs_ioctl_snap_create_transid+0x10a/0x190 [btrfs]
> [ 2360.849441] [<ffffffffa014005d>] btrfs_ioctl_snap_create_v2.clone.0+0xfd/0x110 [btrfs]
> [ 2360.849455] [<ffffffffa01419ee>] btrfs_ioctl+0x48e/0x1340 [btrfs]
> [ 2360.849469] [<ffffffff818f0f00>] ? do_page_fault+0x2d0/0x580
> [ 2360.849471] [<ffffffff818eca70>] ? _raw_spin_unlock_irq+0x30/0x50
> [ 2360.849473] [<ffffffff81078463>] ? finish_task_switch+0x83/0xf0
> [ 2360.849476] [<ffffffff81161d08>] do_vfs_ioctl+0x98/0x560
> [ 2360.849478] [<ffffffff818ed215>] ? retint_swapgs+0x13/0x1b
> [ 2360.849481] [<ffffffff8116221f>] sys_ioctl+0x4f/0x80
> [ 2360.849483] [<ffffffff818f56e9>] system_call_fastpath+0x16/0x1b
>
> fs/btrfs/extent-tree.c:6047
>
> 6046         if (parent > 0) {
> 6047                 BUG_ON(!(flags & BTRFS_BLOCK_FLAG_FULL_BACKREF));
> 6048                 btrfs_set_extent_inline_ref_type(leaf, iref,
> 6049                                                  BTRFS_SHARED_BLOCK_REF_KEY);
> 6050                 btrfs_set_extent_inline_ref_offset(leaf, iref, parent);
> 6051         } else {
> 6052                 btrfs_set_extent_inline_ref_type(leaf, iref,
> 6053                                                  BTRFS_TREE_BLOCK_REF_KEY);
> 6054                 btrfs_set_extent_inline_ref_offset(leaf, iref, root_objectid);
> 6055         }

This bug is similar to the one reported by Daniel J Blueman a month ago. Josef
has fixed it, but the patch has not been merged into the for-linus branch yet.
Did you apply that patch?

>
> Currently for-linux hangs early during the test, so I applied V3 patches on top
> of 3.5.
>
> The filesystem is freshly created, the load is to simultaneously unpack large tar,
> snapshot the fs, delete random snapshot, looped rm of the untarred dir. Crashes after
> some minutes, reliably.

Could you send me the test tool? I want to look into it.

Thanks
Miao

>
> Fsck spits lots of errors:
>
> ref mismatch on [1133031424 4096] extent item 1, found 0
> Backref 1133031424 root 5 not referenced back 0x7d1f40
> Incorrect global backref count on 1133031424 found 1 wanted 0
> backpointer mismatch on [1133031424 4096]
> owner ref check failed [1133031424 4096]
>
> ref mismatch on [11213131776 16384] extent item 1, found 0
> Incorrect local backref count on 11213131776 root 5 owner 34509 offset 0 found 0 wanted 1 back 0x1424d8e0
> backpointer mismatch on [11213131776 16384]
> owner ref check failed [11213131776 16384]
>
> fs tree 260 refs 6 not found
> unresolved ref root 263 dir 256 index 4 namelen 14 name snap2748615355 error 600
> unresolved ref root 267 dir 256 index 4 namelen 14 name snap2748615355 error 600
> unresolved ref root 269 dir 256 index 4 namelen 14 name snap2748615355 error 600
> unresolved ref root 273 dir 256 index 4 namelen 14 name snap2748615355 error 600
> unresolved ref root 274 dir 256 index 4 namelen 14 name snap2748615355 error 600
> unresolved ref root 276 dir 256 index 4 namelen 14 name snap2748615355 error 600
>
>
> I've asked Josef to pull those patches out of btrfs-next, feel free to send me any testing
> version if you can't reproduce it on your side.
>
>
> david
On Thu, Aug 02, 2012 at 07:53:36PM -0600, Miao Xie wrote: > On Thu, 2 Aug 2012 13:46:31 +0200, David Sterba wrote: > > Hi, > > > > appologies for late reply, > > > > On Thu, Aug 02, 2012 at 12:40:46PM +0800, Miao Xie wrote: > >> Changelog v1 -> v2: > >> - add comment to explain why we need deal with the delayed items after > >> snapshot creation and why this operation do not corrupt the metadata. > > > > I'm sorry, the comment did not fix the bug :) > > > > The subvol stress is able to hit this: > > > > [ 2360.444321] ------------[ cut here ]------------ > > [ 2360.448019] kernel BUG at fs/btrfs/extent-tree.c:6047! > > [ 2360.448019] invalid opcode: 0000 [#1] SMP > > [ 2360.448019] CPU 0 > > [ 2360.448019] Modules linked in: btrfs aoe [last unloaded: btrfs] > > [ 2360.448019] > > [ 2360.448019] Pid: 8212, comm: btrfs Not tainted 3.5.0-default+ #170 Intel Corporation Santa Rosa platform/Matanzas > > [ 2360.448019] RIP: 0010:[<ffffffffa00f62a1>] [<ffffffffa00f62a1>] run_clustered_refs+0xa11/0xa20 [btrfs] > > [ 2360.448019] RSP: 0018:ffff88003eca1a68 EFLAGS: 00010246 > > [ 2360.448019] RAX: 00000000000007ff RBX: ffff880017a694c8 RCX: ffff88003eca1a08 > > [ 2360.448019] RDX: ffff880028aa9000 RSI: 00000000000007fe RDI: ffff880064223cf0 > > [ 2360.448019] RBP: ffff88003eca1b48 R08: 00000000000007ff R09: ffff88003eca19f8 > > [ 2360.448019] R10: ffff88002435d1e8 R11: 0000000000000000 R12: ffff880025d66d28 > > [ 2360.448019] R13: ffff880038640000 R14: ffff8800778dfa88 R15: ffff880060f010d0 > > [ 2360.448019] FS: 00007f3289f35740(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000 > > [ 2360.448019] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > > [ 2360.448019] CR2: ffffffffff600400 CR3: 000000002e112000 CR4: 00000000000007f0 > > [ 2360.448019] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > [ 2360.448019] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > > [ 2360.448019] Process btrfs (pid: 8212, threadinfo ffff88003eca0000, 
task ffff88001d834200) > > [ 2360.448019] Stack: > > [ 2360.448019] 0000000000000000 0000000000000000 0000000000000001 0000000000000000 > > [ 2360.448019] 00000000000007ed ffff88002435d1e8 000000003eca1b18 0000000000000000 > > [ 2360.448019] 0000000000000770 0000000000000000 000000005cb1e000 ffff88003eca1c08 > > [ 2360.448019] Call Trace: > > [ 2360.448019] [<ffffffffa00f6479>] btrfs_run_delayed_refs+0x1c9/0x550 [btrfs] > > [ 2360.448019] [<ffffffff810a4d15>] ? trace_hardirqs_on_caller+0x155/0x1d0 > > [ 2360.448019] [<ffffffffa00e306a>] ? btrfs_free_path+0x2a/0x40 [btrfs] > > [ 2360.448019] [<ffffffffa015c741>] ? btrfs_run_delayed_items+0xf1/0x160 [btrfs] > > [ 2360.448019] [<ffffffffa0108a15>] btrfs_commit_transaction+0x605/0xb00 [btrfs] > > [ 2360.448019] [<ffffffff8109e70d>] ? lock_release_holdtime+0x3d/0x1c0 > > [ 2360.448019] [<ffffffffa013fc88>] ? btrfs_mksubvol+0x298/0x360 [btrfs] > > [ 2360.448019] [<ffffffff8106d210>] ? wake_up_bit+0x40/0x40 > > [ 2360.448019] [<ffffffff8137d88e>] ? do_raw_spin_unlock+0x5e/0xb0 > > [ 2360.448019] [<ffffffffa013fd48>] btrfs_mksubvol+0x358/0x360 [btrfs] > > [ 2360.448019] [<ffffffffa013fe5a>] btrfs_ioctl_snap_create_transid+0x10a/0x190 [btrfs] > > [ 2360.448019] [<ffffffffa014005d>] btrfs_ioctl_snap_create_v2.clone.0+0xfd/0x110 [btrfs] > > [ 2360.448019] [<ffffffffa01419ee>] btrfs_ioctl+0x48e/0x1340 [btrfs] > > [ 2360.448019] [<ffffffff818f0f00>] ? do_page_fault+0x2d0/0x580 > > [ 2360.448019] [<ffffffff818eca70>] ? _raw_spin_unlock_irq+0x30/0x50 > > [ 2360.448019] [<ffffffff81078463>] ? finish_task_switch+0x83/0xf0 > > [ 2360.448019] [<ffffffff81161d08>] do_vfs_ioctl+0x98/0x560 > > [ 2360.448019] [<ffffffff818ed215>] ? 
retint_swapgs+0x13/0x1b > > [ 2360.448019] [<ffffffff8116221f>] sys_ioctl+0x4f/0x80 > > [ 2360.448019] [<ffffffff818f56e9>] system_call_fastpath+0x16/0x1b > > [ 2360.448019] Code: 8b 76 40 48 89 d7 48 89 55 a0 e8 2b 74 ff ff 83 f8 17 0f 87 1e ff ff ff 0f 0b 80 fa b2 0f 84 b4 f8 ff ff 0f 0b 0f 0b 0f 0b 0f 0b <0f> 0b 0f 0b 0f 0b 0f 0b 0f 1f 80 00 00 00 00 55 48 89 e5 41 57 > > [ 2360.448019] RIP [<ffffffffa00f62a1>] run_clustered_refs+0xa11/0xa20 [btrfs] > > [ 2360.448019] RSP <ffff88003eca1a68> > > [ 2360.814508] ---[ end trace 555a16cac3620ccb ]--- > > [ 2360.820398] note: btrfs[8212] exited with preempt_count 1 > > [ 2360.827072] BUG: sleeping function called from invalid context at kernel/rwsem.c:20 > > [ 2360.836047] in_atomic(): 1, irqs_disabled(): 0, pid: 8212, name: btrfs > > [ 2360.843859] INFO: lockdep is turned off. > > [ 2360.849021] Pid: 8212, comm: btrfs Tainted: G D 3.5.0-default+ #170 > > [ 2360.849022] Call Trace: > > [ 2360.849027] [<ffffffff8107a40c>] __might_sleep+0xfc/0x130 > > [ 2360.849030] [<ffffffff818ea0f6>] down_read+0x26/0xa0 > > [ 2360.849034] [<ffffffff810b416b>] acct_collect+0x4b/0x1b0 > > [ 2360.849038] [<ffffffff8104c838>] do_exit+0x718/0x9a0 > > [ 2360.849041] [<ffffffff81049a26>] ? kmsg_dump+0x26/0x140 > > [ 2360.849043] [<ffffffff818ee0c0>] oops_end+0xb0/0xf0 > > [ 2360.849046] [<ffffffff81005a7b>] die+0x5b/0x90 > > [ 2360.849048] [<ffffffff818ed9a4>] do_trap+0xc4/0x170 > > [ 2360.849052] [<ffffffff810030a5>] do_invalid_op+0x95/0xb0 > > [ 2360.849067] [<ffffffffa00f62a1>] ? run_clustered_refs+0xa11/0xa20 [btrfs] > > [ 2360.849071] [<ffffffff813779dd>] ? trace_hardirqs_off_thunk+0x3a/0x3c > > [ 2360.849073] [<ffffffff818ed260>] ? restore_args+0x30/0x30 > > [ 2360.849076] [<ffffffff818f674b>] invalid_op+0x1b/0x20 > > [ 2360.849087] [<ffffffffa00f62a1>] ? run_clustered_refs+0xa11/0xa20 [btrfs] > > [ 2360.849097] [<ffffffffa00f5f2b>] ? 
run_clustered_refs+0x69b/0xa20 [btrfs] > > [ 2360.849108] [<ffffffffa00f6479>] btrfs_run_delayed_refs+0x1c9/0x550 [btrfs] > > [ 2360.849110] [<ffffffff810a4d15>] ? trace_hardirqs_on_caller+0x155/0x1d0 > > [ 2360.849119] [<ffffffffa00e306a>] ? btrfs_free_path+0x2a/0x40 [btrfs] > > [ 2360.849133] [<ffffffffa015c741>] ? btrfs_run_delayed_items+0xf1/0x160 [btrfs] > > [ 2360.849145] [<ffffffffa0108a15>] btrfs_commit_transaction+0x605/0xb00 [btrfs] > > [ 2360.849148] [<ffffffff8109e70d>] ? lock_release_holdtime+0x3d/0x1c0 > > [ 2360.849161] [<ffffffffa013fc88>] ? btrfs_mksubvol+0x298/0x360 [btrfs] > > [ 2360.849164] [<ffffffff8106d210>] ? wake_up_bit+0x40/0x40 > > [ 2360.849166] [<ffffffff8137d88e>] ? do_raw_spin_unlock+0x5e/0xb0 > > [ 2360.849180] [<ffffffffa013fd48>] btrfs_mksubvol+0x358/0x360 [btrfs] > > [ 2360.849194] [<ffffffffa013fe5a>] btrfs_ioctl_snap_create_transid+0x10a/0x190 [btrfs] > > [ 2360.849207] [<ffffffffa014005d>] btrfs_ioctl_snap_create_v2.clone.0+0xfd/0x110 [btrfs] > > [ 2360.849221] [<ffffffffa01419ee>] btrfs_ioctl+0x48e/0x1340 [btrfs] > > [ 2360.849224] [<ffffffff818f0f00>] ? do_page_fault+0x2d0/0x580 > > [ 2360.849226] [<ffffffff818eca70>] ? _raw_spin_unlock_irq+0x30/0x50 > > [ 2360.849229] [<ffffffff81078463>] ? finish_task_switch+0x83/0xf0 > > [ 2360.849231] [<ffffffff81161d08>] do_vfs_ioctl+0x98/0x560 > > [ 2360.849234] [<ffffffff818ed215>] ? retint_swapgs+0x13/0x1b > > [ 2360.849236] [<ffffffff8116221f>] sys_ioctl+0x4f/0x80 > > [ 2360.849239] [<ffffffff818f56e9>] system_call_fastpath+0x16/0x1b > > [ 2360.849255] BUG: scheduling while atomic: btrfs/8212/0x10000002 > > [ 2360.849256] INFO: lockdep is turned off. 
> > [ 2360.849257] Modules linked in: btrfs aoe [last unloaded: btrfs] > > [ 2360.849261] Pid: 8212, comm: btrfs Tainted: G D 3.5.0-default+ #170 > > [ 2360.849262] Call Trace: > > [ 2360.849262] [<ffffffff81078318>] __schedule_bug+0x68/0x90 > > [ 2360.849265] [<ffffffff818eafcc>] __schedule+0x73c/0x810 > > [ 2360.849268] [<ffffffff8107b48a>] __cond_resched+0x2a/0x40 > > [ 2360.849270] [<ffffffff818eb121>] _cond_resched+0x31/0x40 > > [ 2360.849273] [<ffffffff81128e13>] unmap_single_vma+0x493/0x750 > > [ 2360.849276] [<ffffffff811100b0>] ? lru_deactivate_fn+0x1e0/0x1e0 > > [ 2360.849279] [<ffffffff810a4be0>] ? trace_hardirqs_on_caller+0x20/0x1d0 > > [ 2360.849281] [<ffffffff8112986c>] unmap_vmas+0x3c/0x60 > > [ 2360.849284] [<ffffffff81130de1>] exit_mmap+0x81/0x140 > > [ 2360.849287] [<ffffffff81043824>] mmput+0x74/0x130 > > [ 2360.849289] [<ffffffff8104a520>] exit_mm+0x100/0x120 > > [ 2360.849292] [<ffffffff8104c858>] do_exit+0x738/0x9a0 > > [ 2360.849294] [<ffffffff81049a26>] ? kmsg_dump+0x26/0x140 > > [ 2360.849297] [<ffffffff818ee0c0>] oops_end+0xb0/0xf0 > > [ 2360.849299] [<ffffffff81005a7b>] die+0x5b/0x90 > > [ 2360.849301] [<ffffffff818ed9a4>] do_trap+0xc4/0x170 > > [ 2360.849304] [<ffffffff810030a5>] do_invalid_op+0x95/0xb0 > > [ 2360.849307] [<ffffffffa00f62a1>] ? run_clustered_refs+0xa11/0xa20 [btrfs] > > [ 2360.849317] [<ffffffff813779dd>] ? trace_hardirqs_off_thunk+0x3a/0x3c > > [ 2360.849320] [<ffffffff818ed260>] ? restore_args+0x30/0x30 > > [ 2360.849322] [<ffffffff818f674b>] invalid_op+0x1b/0x20 > > [ 2360.849325] [<ffffffffa00f62a1>] ? run_clustered_refs+0xa11/0xa20 [btrfs] > > [ 2360.849335] [<ffffffffa00f5f2b>] ? run_clustered_refs+0x69b/0xa20 [btrfs] > > [ 2360.849346] [<ffffffffa00f6479>] btrfs_run_delayed_refs+0x1c9/0x550 [btrfs] > > [ 2360.849356] [<ffffffff810a4d15>] ? trace_hardirqs_on_caller+0x155/0x1d0 > > [ 2360.849358] [<ffffffffa00e306a>] ? btrfs_free_path+0x2a/0x40 [btrfs] > > [ 2360.849367] [<ffffffffa015c741>] ? 
btrfs_run_delayed_items+0xf1/0x160 [btrfs] > > [ 2360.849380] [<ffffffffa0108a15>] btrfs_commit_transaction+0x605/0xb00 [btrfs] > > [ 2360.849393] [<ffffffff8109e70d>] ? lock_release_holdtime+0x3d/0x1c0 > > [ 2360.849395] [<ffffffffa013fc88>] ? btrfs_mksubvol+0x298/0x360 [btrfs] > > [ 2360.849409] [<ffffffff8106d210>] ? wake_up_bit+0x40/0x40 > > [ 2360.849411] [<ffffffff8137d88e>] ? do_raw_spin_unlock+0x5e/0xb0 > > [ 2360.849413] [<ffffffffa013fd48>] btrfs_mksubvol+0x358/0x360 [btrfs] > > [ 2360.849427] [<ffffffffa013fe5a>] btrfs_ioctl_snap_create_transid+0x10a/0x190 [btrfs] > > [ 2360.849441] [<ffffffffa014005d>] btrfs_ioctl_snap_create_v2.clone.0+0xfd/0x110 [btrfs] > > [ 2360.849455] [<ffffffffa01419ee>] btrfs_ioctl+0x48e/0x1340 [btrfs] > > [ 2360.849469] [<ffffffff818f0f00>] ? do_page_fault+0x2d0/0x580 > > [ 2360.849471] [<ffffffff818eca70>] ? _raw_spin_unlock_irq+0x30/0x50 > > [ 2360.849473] [<ffffffff81078463>] ? finish_task_switch+0x83/0xf0 > > [ 2360.849476] [<ffffffff81161d08>] do_vfs_ioctl+0x98/0x560 > > [ 2360.849478] [<ffffffff818ed215>] ? retint_swapgs+0x13/0x1b > > [ 2360.849481] [<ffffffff8116221f>] sys_ioctl+0x4f/0x80 > > [ 2360.849483] [<ffffffff818f56e9>] system_call_fastpath+0x16/0x1b > > > > fs/btrfs/extent-tree.c:6047 > > > > 6046 if (parent > 0) { > > 6047 BUG_ON(!(flags & BTRFS_BLOCK_FLAG_FULL_BACKREF)); > > 6048 btrfs_set_extent_inline_ref_type(leaf, iref, > > 6049 BTRFS_SHARED_BLOCK_REF_KEY); > > 6050 btrfs_set_extent_inline_ref_offset(leaf, iref, parent); > > 6051 } else { > > 6052 btrfs_set_extent_inline_ref_type(leaf, iref, > > 6053 BTRFS_TREE_BLOCK_REF_KEY); > > 6054 btrfs_set_extent_inline_ref_offset(leaf, iref, root_objectid); > > 6055 } > > This bug is similar to the one which is reported by Daniel J Blueman a month ago. And > Josef have fixed it, but the patch has not been merged into for-linus branch till now. > Did you applied that patch? > What patch is this? 
Thanks, Josef -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On fri, 3 Aug 2012 17:03:01 -0400, Josef Bacik wrote: > On Thu, Aug 02, 2012 at 07:53:36PM -0600, Miao Xie wrote: >> On Thu, 2 Aug 2012 13:46:31 +0200, David Sterba wrote: >>> Hi, >>> >>> appologies for late reply, >>> >>> On Thu, Aug 02, 2012 at 12:40:46PM +0800, Miao Xie wrote: >>>> Changelog v1 -> v2: >>>> - add comment to explain why we need deal with the delayed items after >>>> snapshot creation and why this operation do not corrupt the metadata. >>> >>> I'm sorry, the comment did not fix the bug :) >>> >>> The subvol stress is able to hit this: >>> >>> [ 2360.444321] ------------[ cut here ]------------ >>> [ 2360.448019] kernel BUG at fs/btrfs/extent-tree.c:6047! >>> [ 2360.448019] invalid opcode: 0000 [#1] SMP >>> [ 2360.448019] CPU 0 >>> [ 2360.448019] Modules linked in: btrfs aoe [last unloaded: btrfs] >>> [ 2360.448019] >>> [ 2360.448019] Pid: 8212, comm: btrfs Not tainted 3.5.0-default+ #170 Intel Corporation Santa Rosa platform/Matanzas >>> [ 2360.448019] RIP: 0010:[<ffffffffa00f62a1>] [<ffffffffa00f62a1>] run_clustered_refs+0xa11/0xa20 [btrfs] >>> [ 2360.448019] RSP: 0018:ffff88003eca1a68 EFLAGS: 00010246 >>> [ 2360.448019] RAX: 00000000000007ff RBX: ffff880017a694c8 RCX: ffff88003eca1a08 >>> [ 2360.448019] RDX: ffff880028aa9000 RSI: 00000000000007fe RDI: ffff880064223cf0 >>> [ 2360.448019] RBP: ffff88003eca1b48 R08: 00000000000007ff R09: ffff88003eca19f8 >>> [ 2360.448019] R10: ffff88002435d1e8 R11: 0000000000000000 R12: ffff880025d66d28 >>> [ 2360.448019] R13: ffff880038640000 R14: ffff8800778dfa88 R15: ffff880060f010d0 >>> [ 2360.448019] FS: 00007f3289f35740(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000 >>> [ 2360.448019] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b >>> [ 2360.448019] CR2: ffffffffff600400 CR3: 000000002e112000 CR4: 00000000000007f0 >>> [ 2360.448019] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >>> [ 2360.448019] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 >>> [ 2360.448019] 
Process btrfs (pid: 8212, threadinfo ffff88003eca0000, task ffff88001d834200) >>> [ 2360.448019] Stack: >>> [ 2360.448019] 0000000000000000 0000000000000000 0000000000000001 0000000000000000 >>> [ 2360.448019] 00000000000007ed ffff88002435d1e8 000000003eca1b18 0000000000000000 >>> [ 2360.448019] 0000000000000770 0000000000000000 000000005cb1e000 ffff88003eca1c08 >>> [ 2360.448019] Call Trace: >>> [ 2360.448019] [<ffffffffa00f6479>] btrfs_run_delayed_refs+0x1c9/0x550 [btrfs] >>> [ 2360.448019] [<ffffffff810a4d15>] ? trace_hardirqs_on_caller+0x155/0x1d0 >>> [ 2360.448019] [<ffffffffa00e306a>] ? btrfs_free_path+0x2a/0x40 [btrfs] >>> [ 2360.448019] [<ffffffffa015c741>] ? btrfs_run_delayed_items+0xf1/0x160 [btrfs] >>> [ 2360.448019] [<ffffffffa0108a15>] btrfs_commit_transaction+0x605/0xb00 [btrfs] >>> [ 2360.448019] [<ffffffff8109e70d>] ? lock_release_holdtime+0x3d/0x1c0 >>> [ 2360.448019] [<ffffffffa013fc88>] ? btrfs_mksubvol+0x298/0x360 [btrfs] >>> [ 2360.448019] [<ffffffff8106d210>] ? wake_up_bit+0x40/0x40 >>> [ 2360.448019] [<ffffffff8137d88e>] ? do_raw_spin_unlock+0x5e/0xb0 >>> [ 2360.448019] [<ffffffffa013fd48>] btrfs_mksubvol+0x358/0x360 [btrfs] >>> [ 2360.448019] [<ffffffffa013fe5a>] btrfs_ioctl_snap_create_transid+0x10a/0x190 [btrfs] >>> [ 2360.448019] [<ffffffffa014005d>] btrfs_ioctl_snap_create_v2.clone.0+0xfd/0x110 [btrfs] >>> [ 2360.448019] [<ffffffffa01419ee>] btrfs_ioctl+0x48e/0x1340 [btrfs] >>> [ 2360.448019] [<ffffffff818f0f00>] ? do_page_fault+0x2d0/0x580 >>> [ 2360.448019] [<ffffffff818eca70>] ? _raw_spin_unlock_irq+0x30/0x50 >>> [ 2360.448019] [<ffffffff81078463>] ? finish_task_switch+0x83/0xf0 >>> [ 2360.448019] [<ffffffff81161d08>] do_vfs_ioctl+0x98/0x560 >>> [ 2360.448019] [<ffffffff818ed215>] ? 
retint_swapgs+0x13/0x1b >>> [ 2360.448019] [<ffffffff8116221f>] sys_ioctl+0x4f/0x80 >>> [ 2360.448019] [<ffffffff818f56e9>] system_call_fastpath+0x16/0x1b >>> [ 2360.448019] Code: 8b 76 40 48 89 d7 48 89 55 a0 e8 2b 74 ff ff 83 f8 17 0f 87 1e ff ff ff 0f 0b 80 fa b2 0f 84 b4 f8 ff ff 0f 0b 0f 0b 0f 0b 0f 0b <0f> 0b 0f 0b 0f 0b 0f 0b 0f 1f 80 00 00 00 00 55 48 89 e5 41 57 >>> [ 2360.448019] RIP [<ffffffffa00f62a1>] run_clustered_refs+0xa11/0xa20 [btrfs] >>> [ 2360.448019] RSP <ffff88003eca1a68> >>> [ 2360.814508] ---[ end trace 555a16cac3620ccb ]--- >>> [ 2360.820398] note: btrfs[8212] exited with preempt_count 1 >>> [ 2360.827072] BUG: sleeping function called from invalid context at kernel/rwsem.c:20 >>> [ 2360.836047] in_atomic(): 1, irqs_disabled(): 0, pid: 8212, name: btrfs >>> [ 2360.843859] INFO: lockdep is turned off. >>> [ 2360.849021] Pid: 8212, comm: btrfs Tainted: G D 3.5.0-default+ #170 >>> [ 2360.849022] Call Trace: >>> [ 2360.849027] [<ffffffff8107a40c>] __might_sleep+0xfc/0x130 >>> [ 2360.849030] [<ffffffff818ea0f6>] down_read+0x26/0xa0 >>> [ 2360.849034] [<ffffffff810b416b>] acct_collect+0x4b/0x1b0 >>> [ 2360.849038] [<ffffffff8104c838>] do_exit+0x718/0x9a0 >>> [ 2360.849041] [<ffffffff81049a26>] ? kmsg_dump+0x26/0x140 >>> [ 2360.849043] [<ffffffff818ee0c0>] oops_end+0xb0/0xf0 >>> [ 2360.849046] [<ffffffff81005a7b>] die+0x5b/0x90 >>> [ 2360.849048] [<ffffffff818ed9a4>] do_trap+0xc4/0x170 >>> [ 2360.849052] [<ffffffff810030a5>] do_invalid_op+0x95/0xb0 >>> [ 2360.849067] [<ffffffffa00f62a1>] ? run_clustered_refs+0xa11/0xa20 [btrfs] >>> [ 2360.849071] [<ffffffff813779dd>] ? trace_hardirqs_off_thunk+0x3a/0x3c >>> [ 2360.849073] [<ffffffff818ed260>] ? restore_args+0x30/0x30 >>> [ 2360.849076] [<ffffffff818f674b>] invalid_op+0x1b/0x20 >>> [ 2360.849087] [<ffffffffa00f62a1>] ? run_clustered_refs+0xa11/0xa20 [btrfs] >>> [ 2360.849097] [<ffffffffa00f5f2b>] ? 
run_clustered_refs+0x69b/0xa20 [btrfs] >>> [ 2360.849108] [<ffffffffa00f6479>] btrfs_run_delayed_refs+0x1c9/0x550 [btrfs] >>> [ 2360.849110] [<ffffffff810a4d15>] ? trace_hardirqs_on_caller+0x155/0x1d0 >>> [ 2360.849119] [<ffffffffa00e306a>] ? btrfs_free_path+0x2a/0x40 [btrfs] >>> [ 2360.849133] [<ffffffffa015c741>] ? btrfs_run_delayed_items+0xf1/0x160 [btrfs] >>> [ 2360.849145] [<ffffffffa0108a15>] btrfs_commit_transaction+0x605/0xb00 [btrfs] >>> [ 2360.849148] [<ffffffff8109e70d>] ? lock_release_holdtime+0x3d/0x1c0 >>> [ 2360.849161] [<ffffffffa013fc88>] ? btrfs_mksubvol+0x298/0x360 [btrfs] >>> [ 2360.849164] [<ffffffff8106d210>] ? wake_up_bit+0x40/0x40 >>> [ 2360.849166] [<ffffffff8137d88e>] ? do_raw_spin_unlock+0x5e/0xb0 >>> [ 2360.849180] [<ffffffffa013fd48>] btrfs_mksubvol+0x358/0x360 [btrfs] >>> [ 2360.849194] [<ffffffffa013fe5a>] btrfs_ioctl_snap_create_transid+0x10a/0x190 [btrfs] >>> [ 2360.849207] [<ffffffffa014005d>] btrfs_ioctl_snap_create_v2.clone.0+0xfd/0x110 [btrfs] >>> [ 2360.849221] [<ffffffffa01419ee>] btrfs_ioctl+0x48e/0x1340 [btrfs] >>> [ 2360.849224] [<ffffffff818f0f00>] ? do_page_fault+0x2d0/0x580 >>> [ 2360.849226] [<ffffffff818eca70>] ? _raw_spin_unlock_irq+0x30/0x50 >>> [ 2360.849229] [<ffffffff81078463>] ? finish_task_switch+0x83/0xf0 >>> [ 2360.849231] [<ffffffff81161d08>] do_vfs_ioctl+0x98/0x560 >>> [ 2360.849234] [<ffffffff818ed215>] ? retint_swapgs+0x13/0x1b >>> [ 2360.849236] [<ffffffff8116221f>] sys_ioctl+0x4f/0x80 >>> [ 2360.849239] [<ffffffff818f56e9>] system_call_fastpath+0x16/0x1b >>> [ 2360.849255] BUG: scheduling while atomic: btrfs/8212/0x10000002 >>> [ 2360.849256] INFO: lockdep is turned off. 
>>> [ 2360.849257] Modules linked in: btrfs aoe [last unloaded: btrfs] >>> [ 2360.849261] Pid: 8212, comm: btrfs Tainted: G D 3.5.0-default+ #170 >>> [ 2360.849262] Call Trace: >>> [ 2360.849262] [<ffffffff81078318>] __schedule_bug+0x68/0x90 >>> [ 2360.849265] [<ffffffff818eafcc>] __schedule+0x73c/0x810 >>> [ 2360.849268] [<ffffffff8107b48a>] __cond_resched+0x2a/0x40 >>> [ 2360.849270] [<ffffffff818eb121>] _cond_resched+0x31/0x40 >>> [ 2360.849273] [<ffffffff81128e13>] unmap_single_vma+0x493/0x750 >>> [ 2360.849276] [<ffffffff811100b0>] ? lru_deactivate_fn+0x1e0/0x1e0 >>> [ 2360.849279] [<ffffffff810a4be0>] ? trace_hardirqs_on_caller+0x20/0x1d0 >>> [ 2360.849281] [<ffffffff8112986c>] unmap_vmas+0x3c/0x60 >>> [ 2360.849284] [<ffffffff81130de1>] exit_mmap+0x81/0x140 >>> [ 2360.849287] [<ffffffff81043824>] mmput+0x74/0x130 >>> [ 2360.849289] [<ffffffff8104a520>] exit_mm+0x100/0x120 >>> [ 2360.849292] [<ffffffff8104c858>] do_exit+0x738/0x9a0 >>> [ 2360.849294] [<ffffffff81049a26>] ? kmsg_dump+0x26/0x140 >>> [ 2360.849297] [<ffffffff818ee0c0>] oops_end+0xb0/0xf0 >>> [ 2360.849299] [<ffffffff81005a7b>] die+0x5b/0x90 >>> [ 2360.849301] [<ffffffff818ed9a4>] do_trap+0xc4/0x170 >>> [ 2360.849304] [<ffffffff810030a5>] do_invalid_op+0x95/0xb0 >>> [ 2360.849307] [<ffffffffa00f62a1>] ? run_clustered_refs+0xa11/0xa20 [btrfs] >>> [ 2360.849317] [<ffffffff813779dd>] ? trace_hardirqs_off_thunk+0x3a/0x3c >>> [ 2360.849320] [<ffffffff818ed260>] ? restore_args+0x30/0x30 >>> [ 2360.849322] [<ffffffff818f674b>] invalid_op+0x1b/0x20 >>> [ 2360.849325] [<ffffffffa00f62a1>] ? run_clustered_refs+0xa11/0xa20 [btrfs] >>> [ 2360.849335] [<ffffffffa00f5f2b>] ? run_clustered_refs+0x69b/0xa20 [btrfs] >>> [ 2360.849346] [<ffffffffa00f6479>] btrfs_run_delayed_refs+0x1c9/0x550 [btrfs] >>> [ 2360.849356] [<ffffffff810a4d15>] ? trace_hardirqs_on_caller+0x155/0x1d0 >>> [ 2360.849358] [<ffffffffa00e306a>] ? btrfs_free_path+0x2a/0x40 [btrfs] >>> [ 2360.849367] [<ffffffffa015c741>] ? 
btrfs_run_delayed_items+0xf1/0x160 [btrfs] >>> [ 2360.849380] [<ffffffffa0108a15>] btrfs_commit_transaction+0x605/0xb00 [btrfs] >>> [ 2360.849393] [<ffffffff8109e70d>] ? lock_release_holdtime+0x3d/0x1c0 >>> [ 2360.849395] [<ffffffffa013fc88>] ? btrfs_mksubvol+0x298/0x360 [btrfs] >>> [ 2360.849409] [<ffffffff8106d210>] ? wake_up_bit+0x40/0x40 >>> [ 2360.849411] [<ffffffff8137d88e>] ? do_raw_spin_unlock+0x5e/0xb0 >>> [ 2360.849413] [<ffffffffa013fd48>] btrfs_mksubvol+0x358/0x360 [btrfs] >>> [ 2360.849427] [<ffffffffa013fe5a>] btrfs_ioctl_snap_create_transid+0x10a/0x190 [btrfs] >>> [ 2360.849441] [<ffffffffa014005d>] btrfs_ioctl_snap_create_v2.clone.0+0xfd/0x110 [btrfs] >>> [ 2360.849455] [<ffffffffa01419ee>] btrfs_ioctl+0x48e/0x1340 [btrfs] >>> [ 2360.849469] [<ffffffff818f0f00>] ? do_page_fault+0x2d0/0x580 >>> [ 2360.849471] [<ffffffff818eca70>] ? _raw_spin_unlock_irq+0x30/0x50 >>> [ 2360.849473] [<ffffffff81078463>] ? finish_task_switch+0x83/0xf0 >>> [ 2360.849476] [<ffffffff81161d08>] do_vfs_ioctl+0x98/0x560 >>> [ 2360.849478] [<ffffffff818ed215>] ? retint_swapgs+0x13/0x1b >>> [ 2360.849481] [<ffffffff8116221f>] sys_ioctl+0x4f/0x80 >>> [ 2360.849483] [<ffffffff818f56e9>] system_call_fastpath+0x16/0x1b >>> >>> fs/btrfs/extent-tree.c:6047 >>> >>> 6046 if (parent > 0) { >>> 6047 BUG_ON(!(flags & BTRFS_BLOCK_FLAG_FULL_BACKREF)); >>> 6048 btrfs_set_extent_inline_ref_type(leaf, iref, >>> 6049 BTRFS_SHARED_BLOCK_REF_KEY); >>> 6050 btrfs_set_extent_inline_ref_offset(leaf, iref, parent); >>> 6051 } else { >>> 6052 btrfs_set_extent_inline_ref_type(leaf, iref, >>> 6053 BTRFS_TREE_BLOCK_REF_KEY); >>> 6054 btrfs_set_extent_inline_ref_offset(leaf, iref, root_objectid); >>> 6055 } >> >> This bug is similar to the one which is reported by Daniel J Blueman a month ago. And >> Josef have fixed it, but the patch has not been merged into for-linus branch till now. >> Did you applied that patch? >> > > What patch is this? 
Thanks,

http://marc.info/?l=linux-btrfs&m=134227134622271&w=2

The URL of the bug report is:
http://marc.info/?l=linux-btrfs&m=134120032905388&w=2

But I'm not sure these two bugs are the same, so I need David's test tool to look into it.

Thanks
Miao
On Sat, Aug 04, 2012 at 01:53:28PM +0800, Miao Xie wrote:
> But I'm not sure these two bugs is the same, so I need the test tool
> of David to look into it.

Attached. It's a set of scripts with a few assumptions hardcoded, like where the tar source is and the name of the extracted directory, so it'll need a few tweaks. Also, the actions are started inside a tmux session for convenience.

The expected stress load generates lots of files and directories, creates and deletes snapshots, and runs rm on the untarred directory. The 'rm' step is delayed a few minutes so that tar generates enough data. On some hosts the rm phase is fast and will clean the untar directory too quickly.

david
On Wed, 8 Aug 2012 15:38:41 +0200, David Sterba wrote:
> On Sat, Aug 04, 2012 at 01:53:28PM +0800, Miao Xie wrote:
>> But I'm not sure these two bugs is the same, so I need the test tool
>> of David to look into it.
>
> Attached. It's a set of scripts and has a few assumptions hardcoded,
> like where the tar source is and the name of extracted directory, so
> it'll need a few tweaks. Also the actions are started inside a tmux
> session for convenience.
>
> The expected stress load is to generate lots of files and directories,
> snapshot create and delete and rm of the untarred directory. The 'rm'
> step is delayed a few minutes so the tar generates enough data. On some
> hosts the rm phase is fast and will clean the untar directory too
> quickly.

Thanks. I found that the bug you reported was not introduced by my patch; it is an old bug. I have sent an RFC patch, and I would appreciate any comments on it.

Regards
Miao
On Thu, Aug 2, 2012 at 6:46 AM, David Sterba <dave@jikos.cz> wrote:
...
>
> Fsck spits lots of errors:
>
> ref mismatch on [1133031424 4096] extent item 1, found 0
> Backref 1133031424 root 5 not referenced back 0x7d1f40
> Incorrect global backref count on 1133031424 found 1 wanted 0
> backpointer mismatch on [1133031424 4096]
> owner ref check failed [1133031424 4096]
>
> ref mismatch on [11213131776 16384] extent item 1, found 0
> Incorrect local backref count on 11213131776 root 5 owner 34509 offset 0 found 0 wanted 1 back 0x1424d8e0
> backpointer mismatch on [11213131776 16384]
> owner ref check failed [11213131776 16384]
>
> fs tree 260 refs 6 not found
> unresolved ref root 263 dir 256 index 4 namelen 14 name snap2748615355 error 600
> unresolved ref root 267 dir 256 index 4 namelen 14 name snap2748615355 error 600
> unresolved ref root 269 dir 256 index 4 namelen 14 name snap2748615355 error 600
> unresolved ref root 273 dir 256 index 4 namelen 14 name snap2748615355 error 600
> unresolved ref root 274 dir 256 index 4 namelen 14 name snap2748615355 error 600
> unresolved ref root 276 dir 256 index 4 namelen 14 name snap2748615355 error 600
>
>
> I've asked Josef to pull those patches out of btrfs-next, feel free to send me any testing
> version if you can't reproduce it on your side.
>

I've run into similar errors after an unclean shutdown on a partition where I make use of several subvolumes. Some of the data in the subvolume is inaccessible, although the original root volume seems OK. So far, the partition is resisting my efforts to fix the errors.

This unclean shutdown occurred while using a 3.5.3 kernel merged with the for-linus branch, so it did not contain any of Miao Xie's recent patches to address this issue.

I've made an image of the corrupted volume if anybody has something they'd like me to test. But I'm primarily reporting this to let you know I'm seeing errors similar to the ones thrown off by your test case.
I'm going to look into merging the patches from Josef's btrfs-next to see if the problem recurs.
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 4e9c106..7943dc2 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -950,6 +950,8 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
 	struct btrfs_root *parent_root;
 	struct btrfs_block_rsv *rsv;
 	struct inode *parent_inode;
+	struct btrfs_path *path;
+	struct btrfs_dir_item *dir_item;
 	struct dentry *parent;
 	struct dentry *dentry;
 	struct extent_buffer *tmp;
@@ -962,6 +964,12 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
 	u64 root_flags;
 	uuid_le new_uuid;
 
+	path = btrfs_alloc_path();
+	if (!path) {
+		ret = pending->error = -ENOMEM;
+		goto path_alloc_fail;
+	}
+
 	new_root_item = kmalloc(sizeof(*new_root_item), GFP_NOFS);
 	if (!new_root_item) {
 		ret = pending->error = -ENOMEM;
@@ -1010,22 +1018,20 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
 	 */
 	ret = btrfs_set_inode_index(parent_inode, &index);
 	BUG_ON(ret); /* -ENOMEM */
-	ret = btrfs_insert_dir_item(trans, parent_root,
-				dentry->d_name.name, dentry->d_name.len,
-				parent_inode, &key,
-				BTRFS_FT_DIR, index);
-	if (ret == -EEXIST) {
+
+	/* check if there is a file/dir which has the same name. */
+	dir_item = btrfs_lookup_dir_item(NULL, parent_root, path,
+					 btrfs_ino(parent_inode),
+					 dentry->d_name.name,
+					 dentry->d_name.len, 0);
+	if (dir_item != NULL && !IS_ERR(dir_item)) {
 		pending->error = -EEXIST;
 		goto fail;
-	} else if (ret) {
+	} else if (IS_ERR(dir_item)) {
+		ret = PTR_ERR(dir_item);
 		goto abort_trans;
 	}
-
-	btrfs_i_size_write(parent_inode, parent_inode->i_size +
-					 dentry->d_name.len * 2);
-	ret = btrfs_update_inode(trans, parent_root, parent_inode);
-	if (ret)
-		goto abort_trans;
+	btrfs_release_path(path);
 
 	/*
 	 * pull in the delayed directory update
@@ -1113,12 +1119,29 @@ static noinline int create_pending_snapshot(struct btrfs_trans_handle *trans,
 	ret = btrfs_reloc_post_snapshot(trans, pending);
 	if (ret)
 		goto abort_trans;
+
+	ret = btrfs_insert_dir_item(trans, parent_root,
+				    dentry->d_name.name, dentry->d_name.len,
+				    parent_inode, &key,
+				    BTRFS_FT_DIR, index);
+	/* We have check then name at the beginning, so it is impossible. */
+	BUG_ON(ret == -EEXIST);
+	if (ret)
+		goto abort_trans;
+
+	btrfs_i_size_write(parent_inode, parent_inode->i_size +
+					 dentry->d_name.len * 2);
+	ret = btrfs_update_inode(trans, parent_root, parent_inode);
+	if (ret)
+		goto abort_trans;
 fail:
 	dput(parent);
 	trans->block_rsv = rsv;
 no_free_objectid:
 	kfree(new_root_item);
 root_item_alloc_fail:
+	btrfs_free_path(path);
+path_alloc_fail:
 	btrfs_block_rsv_release(root, &pending->block_rsv, (u64)-1);
 	return ret;
 
@@ -1444,13 +1467,28 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
 	 */
 	mutex_lock(&root->fs_info->reloc_mutex);
 
-	ret = btrfs_run_delayed_items(trans, root);
+	/*
+	 * We needn't worry about the delayed items because we will
+	 * deal with them in create_pending_snapshot(), which is the
+	 * core function of the snapshot creation.
+	 */
+	ret = create_pending_snapshots(trans, root->fs_info);
 	if (ret) {
 		mutex_unlock(&root->fs_info->reloc_mutex);
 		goto cleanup_transaction;
 	}
 
-	ret = create_pending_snapshots(trans, root->fs_info);
+	/*
+	 * We insert the dir indexes of the snapshots and update the inode
+	 * of the snapshots' parents after the snapshot creation, so there
+	 * are some delayed items which are not dealt with. Now deal with
+	 * them.
+	 *
+	 * We needn't worry that this operation will corrupt the snapshots,
+	 * because all the tree which are snapshoted will be forced to COW
+	 * the nodes and leaves.
+	 */
+	ret = btrfs_run_delayed_items(trans, root);
 	if (ret) {
 		mutex_unlock(&root->fs_info->reloc_mutex);
 		goto cleanup_transaction;
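The name check in the diff relies on the kernel's error-pointer convention: btrfs_lookup_dir_item() can return a valid pointer (the name exists), NULL (no such name), or an errno encoded in the pointer value. Below is a minimal userspace sketch of that three-way branch. Everything prefixed with toy_ is hypothetical and only mimics ERR_PTR(), IS_ERR() and PTR_ERR(); it is not the btrfs code, and the simulated -EIO is just an example error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: like the kernel, treat the top 4095 addresses as errno values. */
#define TOY_MAX_ERRNO 4095

static inline void *toy_err_ptr(long error) { return (void *)(intptr_t)error; }
static inline long toy_ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int toy_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO;
}

/*
 * Hypothetical lookup: returns the matching entry, NULL if the name is
 * absent, or an encoded -EIO when the caller asks to simulate a failure.
 */
static const char *toy_lookup(const char *const *names, size_t n,
			      const char *name, int simulate_io_error)
{
	if (simulate_io_error)
		return toy_err_ptr(-EIO);
	for (size_t i = 0; i < n; i++)
		if (strcmp(names[i], name) == 0)
			return names[i];
	return NULL;
}

enum toy_outcome { TOY_PROCEED, TOY_NAME_EXISTS, TOY_ABORT };

/* Same branch structure as the check added to create_pending_snapshot(). */
static enum toy_outcome toy_check_name(const char *const *names, size_t n,
				       const char *name, int simulate_io_error,
				       long *err)
{
	const char *item = toy_lookup(names, n, name, simulate_io_error);

	*err = 0;
	if (item != NULL && !toy_is_err(item)) {
		*err = -EEXIST;			/* name already taken */
		return TOY_NAME_EXISTS;
	} else if (toy_is_err(item)) {
		*err = toy_ptr_err(item);	/* lookup itself failed */
		return TOY_ABORT;
	}
	return TOY_PROCEED;			/* NULL: name is free */
}
```

Note that the NULL test must come first in this sketch as in the diff: NULL is not an error pointer, so a missing name falls through to the "proceed" path rather than either failure branch.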
The snapshot should be the image of the fs tree before it was created, so the metadata of the snapshot should not exist in its tree. But now we find that the directory item and directory name index are in both the snapshot tree and the fs tree. This introduces some problems and confuses users:

 # mkfs.btrfs /dev/sda1
 # mount /dev/sda1 /mnt
 # mkdir /mnt/1
 # cd /mnt/1
 # btrfs subvolume snapshot /mnt snap0
 # ls -a /mnt/1/snap0/1
 .  ..
 [no other file/dir]

 # ll /mnt/1/snap0/
 total 0
 drwxr-xr-x 1 root root 10 Jul 24 12:11 1
                        ^^^ There is no file/dir in it, but its size is 10

 # cd /mnt/1/snap0/1/snap0
 [Enter a nonexistent directory successfully...]

There is nothing in directory 1 in snap0, but btrfs reports the length of this directory as 10. Besides that, we can enter a nonexistent directory, which is very strange to users.

 # btrfs subvolume snapshot /mnt/1/snap0 /mnt/snap1
 # ll /mnt/1/snap0/1/
 total 0
 [None]
 # ll /mnt/snap1/1/
 total 0
 drwxr-xr-x 1 root root 0 Jul 24 12:14 snap0

The source of snap1 did not have any directory in directory 1, but snap1 has a snap0 there, so the snapshot differs from its source.

So I think we should insert the directory item and directory name index and update the parent inode as the last step of snapshot creation, and not leave the useless metadata in the file tree.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
Changelog v2 -> v3:
- rebase on the latest for-linus branch

Changelog v1 -> v2:
- add comment to explain why we need deal with the delayed items after
  snapshot creation and why this operation do not corrupt the metadata.
- move dput() to patch 1/2
---
 fs/btrfs/transaction.c | 66 +++++++++++++++++++++++++++++++++++++----------
 1 files changed, 52 insertions(+), 14 deletions(-)
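The ordering argument in the commit message can be illustrated with a toy model. This is only a sketch under loud assumptions: a directory is modelled as a flat list of names and a snapshot as a plain struct copy, and every toy_ name is made up for illustration. It says nothing about how btrfs actually stores directory items; it only shows why linking the snapshot's name before taking the copy leaves that name inside the copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ENTRIES 8
#define TOY_NAME_LEN 32

/* Toy stand-in for one directory of a tree: a flat list of entry names. */
struct toy_dir {
	char names[TOY_MAX_ENTRIES][TOY_NAME_LEN];
	size_t count;
};

static void toy_dir_add(struct toy_dir *dir, const char *name)
{
	assert(dir->count < TOY_MAX_ENTRIES);
	strncpy(dir->names[dir->count], name, TOY_NAME_LEN - 1);
	dir->names[dir->count][TOY_NAME_LEN - 1] = '\0';
	dir->count++;
}

static bool toy_dir_has(const struct toy_dir *dir, const char *name)
{
	for (size_t i = 0; i < dir->count; i++)
		if (strcmp(dir->names[i], name) == 0)
			return true;
	return false;
}

/* Old ordering: link the name into the parent, then copy the tree. */
static struct toy_dir toy_snapshot_insert_first(struct toy_dir *parent,
						const char *name)
{
	toy_dir_add(parent, name);
	return *parent;		/* the copy carries the snapshot's own entry */
}

/* Patched ordering: copy the tree, then link the name into the parent. */
static struct toy_dir toy_snapshot_insert_last(struct toy_dir *parent,
					       const char *name)
{
	struct toy_dir snap = *parent;

	toy_dir_add(parent, name);
	return snap;		/* the copy predates the new entry */
}
```

With an empty parent directory, toy_snapshot_insert_first() returns a copy that already lists "snap0", which corresponds to the stray entry and nonzero directory size shown above, while toy_snapshot_insert_last() returns a copy without it. In both cases the parent itself ends up containing the new name.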