Message ID | 20220521101416.29793-1-heming.zhao@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/2] ocfs2: fix jbd2 assertion in defragment path | expand |
Hi, Sorry for the late response since I was busy on other things... On 5/21/22 6:14 PM, Heming Zhao wrote: > defragfs triggered jbd2 report ASSERT when working. > > code path: > > ocfs2_ioctl_move_extents > ocfs2_move_extents > __ocfs2_move_extents_range > ocfs2_defrag_extent > __ocfs2_move_extent > + ocfs2_journal_access_di > + ocfs2_split_extent //[1] > + ocfs2_journal_dirty //crash > > [1]: ocfs2_split_extent called ocfs2_extend_trans, which committed > dirty buffers then restarted the transaction. The changed transaction > triggered jbd2 ASSERT. > > crash stacks: > > PID: 11297 TASK: ffff974a676dcd00 CPU: 67 COMMAND: "defragfs.ocfs2" > #0 [ffffb25d8dad3900] machine_kexec at ffffffff8386fe01 > #1 [ffffb25d8dad3958] __crash_kexec at ffffffff8395959d > #2 [ffffb25d8dad3a20] crash_kexec at ffffffff8395a45d > #3 [ffffb25d8dad3a38] oops_end at ffffffff83836d3f > #4 [ffffb25d8dad3a58] do_trap at ffffffff83833205 > #5 [ffffb25d8dad3aa0] do_invalid_op at ffffffff83833aa6 > #6 [ffffb25d8dad3ac0] invalid_op at ffffffff84200d18 > [exception RIP: jbd2_journal_dirty_metadata+0x2ba] > RIP: ffffffffc09ca54a RSP: ffffb25d8dad3b70 RFLAGS: 00010207 > RAX: 0000000000000000 RBX: ffff9706eedc5248 RCX: 0000000000000000 > RDX: 0000000000000001 RSI: ffff97337029ea28 RDI: ffff9706eedc5250 > RBP: ffff9703c3520200 R8: 000000000f46b0b2 R9: 0000000000000000 > R10: 0000000000000001 R11: 00000001000000fe R12: ffff97337029ea28 > R13: 0000000000000000 R14: ffff9703de59bf60 R15: ffff9706eedc5250 > ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 > #7 [ffffb25d8dad3ba8] ocfs2_journal_dirty at ffffffffc137fb95 [ocfs2] > #8 [ffffb25d8dad3be8] __ocfs2_move_extent at ffffffffc139a950 [ocfs2] > #9 [ffffb25d8dad3c80] ocfs2_defrag_extent at ffffffffc139b2d2 [ocfs2] > > The fix method is simple, ocfs2_split_extent includes three paths. all > the paths already have journal access/dirty pair. We only need to > remove journal access/dirty from __ocfs2_move_extent. > I am not sure what you mean by "all three paths have journal access/ dirty pair". It seems we can't do it in your way, as journal access has different ocfs2_trigger with different offset, e.g. dinode is different from extent block. Or we may reserve enough credits to resolve this issue? Thanks, Joseph
On Thu, Jun 02, 2022 at 05:34:18PM +0800, Joseph Qi wrote: > Hi, > > Sorry for the late response since I was busy on other things... > > On 5/21/22 6:14 PM, Heming Zhao wrote: > > defragfs triggered jbd2 report ASSERT when working. > > > > code path: > > > > ocfs2_ioctl_move_extents > > ocfs2_move_extents > > __ocfs2_move_extents_range > > ocfs2_defrag_extent > > __ocfs2_move_extent > > + ocfs2_journal_access_di > > + ocfs2_split_extent //[1] > > + ocfs2_journal_dirty //crash > > > > [1]: ocfs2_split_extent called ocfs2_extend_trans, which committed > > dirty buffers then restarted the transaction. The changed transaction > > triggered jbd2 ASSERT. > > > > crash stacks: > > > > PID: 11297 TASK: ffff974a676dcd00 CPU: 67 COMMAND: "defragfs.ocfs2" > > #0 [ffffb25d8dad3900] machine_kexec at ffffffff8386fe01 > > #1 [ffffb25d8dad3958] __crash_kexec at ffffffff8395959d > > #2 [ffffb25d8dad3a20] crash_kexec at ffffffff8395a45d > > #3 [ffffb25d8dad3a38] oops_end at ffffffff83836d3f > > #4 [ffffb25d8dad3a58] do_trap at ffffffff83833205 > > #5 [ffffb25d8dad3aa0] do_invalid_op at ffffffff83833aa6 > > #6 [ffffb25d8dad3ac0] invalid_op at ffffffff84200d18 > > [exception RIP: jbd2_journal_dirty_metadata+0x2ba] > > RIP: ffffffffc09ca54a RSP: ffffb25d8dad3b70 RFLAGS: 00010207 > > RAX: 0000000000000000 RBX: ffff9706eedc5248 RCX: 0000000000000000 > > RDX: 0000000000000001 RSI: ffff97337029ea28 RDI: ffff9706eedc5250 > > RBP: ffff9703c3520200 R8: 000000000f46b0b2 R9: 0000000000000000 > > R10: 0000000000000001 R11: 00000001000000fe R12: ffff97337029ea28 > > R13: 0000000000000000 R14: ffff9703de59bf60 R15: ffff9706eedc5250 > > ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 > > #7 [ffffb25d8dad3ba8] ocfs2_journal_dirty at ffffffffc137fb95 [ocfs2] > > #8 [ffffb25d8dad3be8] __ocfs2_move_extent at ffffffffc139a950 [ocfs2] > > #9 [ffffb25d8dad3c80] ocfs2_defrag_extent at ffffffffc139b2d2 [ocfs2] > > > > The fix method is simple, ocfs2_split_extent includes three paths. all > > the paths already have journal access/dirty pair. We only need to > > remove journal access/dirty from __ocfs2_move_extent. > > > > I am not sure what you mean by "all three paths have journal access/ > dirty pair". I am a newbie for ocfs2 & jbd2, I can't make sure this [1/2] patch is correct. Below is my analysis. All three paths (below 1.[123]): __ocfs2_move_extent + ocfs2_journal_access_di | + ocfs2_split_extent //[1] | + ocfs2_replace_extent_rec //[1.1] | + ocfs2_split_and_insert //[1.2] | + ocfs2_try_to_merge_extent //[1.3] | + + ocfs2_journal_dirty //crash All three paths have itselves journal access/dirty pair. So the access/dirty pair in caller __ocfs2_move_extent is unnecessary. > > It seems we can't do it in your way, as journal access has different > ocfs2_trigger with different offset, e.g. dinode is different from > extent block. There are 3 ocfs2_split_extent callers: fs/ocfs2/alloc.c <<ocfs2_change_extent_flag>> fs/ocfs2/move_extents.c <<__ocfs2_move_extent>> fs/ocfs2/refcounttree.c <<ocfs2_clear_ext_refcount>> We can see, except defrag flow (caller __ocfs2_move_extent), other two don't use jbd2 access/dirty pair around ocfs2_split_extent. In my view, in defrag code flow, the struct ocfs2_triggers is changed many times. The outermost ocfs2_triggers (belongs to __ocfs2_move_extent) must be overwritten in sub-routine ocfs2_split_extent. In another perspective, why ocfs2_change_extent_flag & ocfs2_clear_ext_refcount don't use journal access/dirty pair to wrap ocfs2_split_extent? > > Or we may reserve enough credits to resolve this issue? > In my view, credits is not the key for this crash issue. The crash is triggered by transaction changed: crash code: ```jbd2_journal_dirty_metadata() if (data_race(jh->b_transaction != transaction && jh->b_next_transaction != transaction)) { spin_lock(&jh->b_state_lock); J_ASSERT_JH(jh, jh->b_transaction == transaction || jh->b_next_transaction == transaction); spin_unlock(&jh->b_state_lock); } ``` why transaction is changed? ocfs2_split_extent ocfs2_split_and_insert ocfs2_do_insert_extent ocfs2_insert_path ocfs2_extend_trans //change transaction, and pls note this func comments Thanks, Heming
On 6/4/22 8:08 AM, Heming Zhao wrote: > On Thu, Jun 02, 2022 at 05:34:18PM +0800, Joseph Qi wrote: >> Hi, >> >> Sorry for the late response since I was busy on other things... >> >> On 5/21/22 6:14 PM, Heming Zhao wrote: >>> defragfs triggered jbd2 report ASSERT when working. >>> >>> code path: >>> >>> ocfs2_ioctl_move_extents >>> ocfs2_move_extents >>> __ocfs2_move_extents_range >>> ocfs2_defrag_extent >>> __ocfs2_move_extent >>> + ocfs2_journal_access_di >>> + ocfs2_split_extent //[1] >>> + ocfs2_journal_dirty //crash >>> >>> [1]: ocfs2_split_extent called ocfs2_extend_trans, which committed >>> dirty buffers then restarted the transaction. The changed transaction >>> triggered jbd2 ASSERT. >>> >>> crash stacks: >>> >>> PID: 11297 TASK: ffff974a676dcd00 CPU: 67 COMMAND: "defragfs.ocfs2" >>> #0 [ffffb25d8dad3900] machine_kexec at ffffffff8386fe01 >>> #1 [ffffb25d8dad3958] __crash_kexec at ffffffff8395959d >>> #2 [ffffb25d8dad3a20] crash_kexec at ffffffff8395a45d >>> #3 [ffffb25d8dad3a38] oops_end at ffffffff83836d3f >>> #4 [ffffb25d8dad3a58] do_trap at ffffffff83833205 >>> #5 [ffffb25d8dad3aa0] do_invalid_op at ffffffff83833aa6 >>> #6 [ffffb25d8dad3ac0] invalid_op at ffffffff84200d18 >>> [exception RIP: jbd2_journal_dirty_metadata+0x2ba] >>> RIP: ffffffffc09ca54a RSP: ffffb25d8dad3b70 RFLAGS: 00010207 >>> RAX: 0000000000000000 RBX: ffff9706eedc5248 RCX: 0000000000000000 >>> RDX: 0000000000000001 RSI: ffff97337029ea28 RDI: ffff9706eedc5250 >>> RBP: ffff9703c3520200 R8: 000000000f46b0b2 R9: 0000000000000000 >>> R10: 0000000000000001 R11: 00000001000000fe R12: ffff97337029ea28 >>> R13: 0000000000000000 R14: ffff9703de59bf60 R15: ffff9706eedc5250 >>> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 >>> #7 [ffffb25d8dad3ba8] ocfs2_journal_dirty at ffffffffc137fb95 [ocfs2] >>> #8 [ffffb25d8dad3be8] __ocfs2_move_extent at ffffffffc139a950 [ocfs2] >>> #9 [ffffb25d8dad3c80] ocfs2_defrag_extent at ffffffffc139b2d2 [ocfs2] >>> >>> The fix method is simple, ocfs2_split_extent includes three paths. all >>> the paths already have journal access/dirty pair. We only need to >>> remove journal access/dirty from __ocfs2_move_extent. >>> >> >> I am not sure what you mean by "all three paths have journal access/ >> dirty pair". > > I am a newbie for ocfs2 & jbd2, I can't make sure this [1/2] patch is correct. > Below is my analysis. > > All three paths (below 1.[123]): > > __ocfs2_move_extent > + ocfs2_journal_access_di > | > + ocfs2_split_extent //[1] > | + ocfs2_replace_extent_rec //[1.1] > | + ocfs2_split_and_insert //[1.2] > | + ocfs2_try_to_merge_extent //[1.3] > | > + + ocfs2_journal_dirty //crash > > All three paths have itselves journal access/dirty pair. > So the access/dirty pair in caller __ocfs2_move_extent is unnecessary. > ocfs2_journal_access_xxx() is to notify jbd2 to do a specific operation to a bh. I said in my last mail, they are for different bh. >> >> It seems we can't do it in your way, as journal access has different >> ocfs2_trigger with different offset, e.g. dinode is different from >> extent block. > > There are 3 ocfs2_split_extent callers: > fs/ocfs2/alloc.c <<ocfs2_change_extent_flag>> > fs/ocfs2/move_extents.c <<__ocfs2_move_extent>> > fs/ocfs2/refcounttree.c <<ocfs2_clear_ext_refcount>> > > We can see, except defrag flow (caller __ocfs2_move_extent), other two > don't use jbd2 access/dirty pair around ocfs2_split_extent. > Not exactly, you have to take look the whole flow. Thanks, Joseph
diff --git a/fs/ocfs2/move_extents.c b/fs/ocfs2/move_extents.c index 192cad0662d8..6251748c695b 100644 --- a/fs/ocfs2/move_extents.c +++ b/fs/ocfs2/move_extents.c @@ -105,14 +105,6 @@ static int __ocfs2_move_extent(handle_t *handle, */ replace_rec.e_flags = ext_flags & ~OCFS2_EXT_REFCOUNTED; - ret = ocfs2_journal_access_di(handle, INODE_CACHE(inode), - context->et.et_root_bh, - OCFS2_JOURNAL_ACCESS_WRITE); - if (ret) { - mlog_errno(ret); - goto out; - } - ret = ocfs2_split_extent(handle, &context->et, path, index, &replace_rec, context->meta_ac, &context->dealloc); @@ -121,8 +113,6 @@ static int __ocfs2_move_extent(handle_t *handle, goto out; } - ocfs2_journal_dirty(handle, context->et.et_root_bh); - context->new_phys_cpos = new_p_cpos; /*
defragfs triggered jbd2 report ASSERT when working. code path: ocfs2_ioctl_move_extents ocfs2_move_extents __ocfs2_move_extents_range ocfs2_defrag_extent __ocfs2_move_extent + ocfs2_journal_access_di + ocfs2_split_extent //[1] + ocfs2_journal_dirty //crash [1]: ocfs2_split_extent called ocfs2_extend_trans, which committed dirty buffers then restarted the transaction. The changed transaction triggered jbd2 ASSERT. crash stacks: PID: 11297 TASK: ffff974a676dcd00 CPU: 67 COMMAND: "defragfs.ocfs2" #0 [ffffb25d8dad3900] machine_kexec at ffffffff8386fe01 #1 [ffffb25d8dad3958] __crash_kexec at ffffffff8395959d #2 [ffffb25d8dad3a20] crash_kexec at ffffffff8395a45d #3 [ffffb25d8dad3a38] oops_end at ffffffff83836d3f #4 [ffffb25d8dad3a58] do_trap at ffffffff83833205 #5 [ffffb25d8dad3aa0] do_invalid_op at ffffffff83833aa6 #6 [ffffb25d8dad3ac0] invalid_op at ffffffff84200d18 [exception RIP: jbd2_journal_dirty_metadata+0x2ba] RIP: ffffffffc09ca54a RSP: ffffb25d8dad3b70 RFLAGS: 00010207 RAX: 0000000000000000 RBX: ffff9706eedc5248 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffff97337029ea28 RDI: ffff9706eedc5250 RBP: ffff9703c3520200 R8: 000000000f46b0b2 R9: 0000000000000000 R10: 0000000000000001 R11: 00000001000000fe R12: ffff97337029ea28 R13: 0000000000000000 R14: ffff9703de59bf60 R15: ffff9706eedc5250 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #7 [ffffb25d8dad3ba8] ocfs2_journal_dirty at ffffffffc137fb95 [ocfs2] #8 [ffffb25d8dad3be8] __ocfs2_move_extent at ffffffffc139a950 [ocfs2] #9 [ffffb25d8dad3c80] ocfs2_defrag_extent at ffffffffc139b2d2 [ocfs2] The fix method is simple, ocfs2_split_extent includes three paths. all the paths already have journal access/dirty pair. We only need to remove journal access/dirty from __ocfs2_move_extent. Signed-off-by: Heming Zhao <heming.zhao@suse.com> --- fs/ocfs2/move_extents.c | 10 ---------- 1 file changed, 10 deletions(-)