Message ID | 20230626112453.22571-1-heming.zhao@suse.com (mailing list archive)
---|---
State | New, archived |
Series | [v2] ocfs2: add error handling path when jbd2 enter ABORT status
(sorry for the last mail with messed-up formatting, resending)

On Mon, Jun 26, 2023 at 07:24:53PM +0800, Heming Zhao wrote:
> fstest generic cases 347, 361, 628 and 629 trigger the same issue:
> when jbd2 enters the ABORT status, ocfs2 ignores it and keeps
> committing the journal. This causes umount to fail (hang).
>
> This commit gives ocfs2 the ability to handle the jbd2 ABORT case.
> With this patch applied, umount succeeds and the journal is left
> dirty on disk.
>
> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> ---
> v2:
> (this v2 is only for patch [2/2], see the URL below)
> Add a description to the commit log.
> Don't reset ->j_num_trans; leave the journal dirty status on disk.
>
> v1: https://oss.oracle.com/pipermail/ocfs2-devel/2023-April/000896.html
> ---
>  fs/ocfs2/alloc.c      | 10 ++++++----
>  fs/ocfs2/journal.c    | 17 +++++++++++++++--
>  fs/ocfs2/journal.h    |  5 +++--
>  fs/ocfs2/localalloc.c |  3 +++
>  4 files changed, 27 insertions(+), 8 deletions(-)
>

It looks like I found 2 bugs while testing this patch:

1> fsck.ocfs2 fails to repair the unclean jbd2 data but still reports success.
2> fsck.ocfs2 doesn't remove the dlm lockspace when it exits.

** how to trigger **

First run the steps below:
(/dev/vdg should be bigger than 1GB)
(these steps are derived from fstest generic case 361)
```
mkfs -t ocfs2 -F -N 4 --cluster-stack pcmk --cluster-name tst -b 4096 /dev/vdg 131072
mount -t ocfs2 /dev/vdg /fstest/scratch
/sbin/xfs_io -i -fc 'truncate 1g' /fstest/scratch/fs.img
mkdir /fstest/scratch/mnt
losetup -f --show /fstest/scratch/fs.img
losetup --direct-io=on /dev/loop0
mkfs -t ocfs2 -N 4 /dev/loop0
mount -t ocfs2 /dev/loop0 /fstest/scratch/mnt
/sbin/xfs_io -i -fc 'pwrite 0 520m' /fstest/scratch/mnt/testfile
/usr/bin/mount -o remount,ro /fstest/scratch/mnt
umount /fstest/scratch/mnt    (<-- triggers the jbd2 abort)
```

The last 'umount' makes jbd2 enter the abort state and leaves unclean
journal data on disk. Then run fsck.ocfs2 to repair the loop0 device:
the tool reports success, but the partition still fails to mount.

```
tb-fstest1:~ # mount -t ocfs2 /dev/loop0 /fstest/scratch/mnt
mount.ocfs2: Internal logic failure while mounting /dev/loop0 on /fstest/scratch/mnt. Check 'dmesg' for more information on this error 5.
tb-fstest1:~ # fsck.ocfs2 /dev/loop0
fsck.ocfs2 1.8.7
Checking OCFS2 filesystem in /dev/loop0:
  Label:              <NONE>
  UUID:               4990448088164475A620ACD486ACFC8E
  Number of blocks:   262144
  Block size:         4096
  Number of clusters: 262144
  Cluster size:       4096
  Number of slots:    4

/dev/loop0 wasn't cleanly unmounted by all nodes.
Attempting to replay the journals for nodes that didn't unmount cleanly
Checking each slot's journal.
Replaying slot 0's journal.
*** There were problems replaying journals.  Be careful in telling fsck to make repairs to this filesystem.
Slot 0's local alloc replayed successfully
/dev/loop0 is clean.  It will be checked after 20 additional mounts.
Slot 0's journal dirty flag removed

tb-fstest1:~ # mount -t ocfs2 /dev/loop0 /fstest/scratch/mnt
mount.ocfs2: Internal logic failure while mounting /dev/loop0 on /fstest/scratch/mnt. Check 'dmesg' for more information on this error 17.
tb-fstest1:~ # ls /sys/kernel/dlm/
0B90A8FBD12C4ED3BB947682045C8B51  4990448088164475A620ACD486ACFC8E
tb-fstest1:~ # ps axj | grep ocfs2
    2  1232     0     0 ?           -1 I<       0   0:00 [ocfs2_wq]
    2  1233     0     0 ?           -1 S        0   0:00 [ocfs2dc-0B90A8FBD12C4ED3BB947682045C8B51]
    2  1240     0     0 ?           -1 S        0   0:00 [ocfs2cmt-0B90A8FBD12C4ED3BB947682045C8B51]
tb-fstest1:~ # dlm_tool leave 4990448088164475A620ACD486ACFC8E
Leaving lockspace "4990448088164475A620ACD486ACFC8E"
done
tb-fstest1:~ # mount -t ocfs2 /dev/loop0 /fstest/scratch/mnt
mount.ocfs2: Internal logic failure while mounting /dev/loop0 on /fstest/scratch/mnt. Check 'dmesg' for more information on this error 5.
```

Thanks,
Heming
```
diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c
index 51c93929a146..d90961a1c433 100644
--- a/fs/ocfs2/alloc.c
+++ b/fs/ocfs2/alloc.c
@@ -6308,11 +6308,13 @@ void ocfs2_truncate_log_shutdown(struct ocfs2_super *osb)
 
         if (tl_inode) {
                 cancel_delayed_work(&osb->osb_truncate_log_wq);
-                flush_workqueue(osb->ocfs2_wq);
+                if (!is_journal_aborted(osb->journal->j_journal)) {
+                        flush_workqueue(osb->ocfs2_wq);
 
-                status = ocfs2_flush_truncate_log(osb);
-                if (status < 0)
-                        mlog_errno(status);
+                        status = ocfs2_flush_truncate_log(osb);
+                        if (status < 0)
+                                mlog_errno(status);
+                }
 
                 brelse(osb->osb_tl_bh);
                 iput(osb->osb_tl_inode);
diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
index 25d8072ccfce..30de277461b3 100644
--- a/fs/ocfs2/journal.c
+++ b/fs/ocfs2/journal.c
@@ -314,6 +314,11 @@ static int ocfs2_commit_cache(struct ocfs2_super *osb)
         if (status < 0) {
                 up_write(&journal->j_trans_barrier);
                 mlog_errno(status);
+                if (is_journal_aborted(journal->j_journal)) {
+                        ocfs2_error(osb->sb, "jbd2 status: ABORT.\n");
+                        wake_up(&journal->j_checkpointed);
+                        status = -EROFS;
+                }
                 goto finally;
         }
 
@@ -1010,6 +1015,8 @@ void ocfs2_journal_shutdown(struct ocfs2_super *osb)
         if (!journal)
                 goto done;
 
+        status = is_journal_aborted(journal->j_journal);
+
         inode = journal->j_inode;
 
         if (journal->j_state != OCFS2_JOURNAL_LOADED)
@@ -1038,9 +1045,10 @@ void ocfs2_journal_shutdown(struct ocfs2_super *osb)
                 osb->commit_task = NULL;
         }
 
-        BUG_ON(atomic_read(&(osb->journal->j_num_trans)) != 0);
+        BUG_ON(!is_journal_aborted(journal->j_journal) &&
+               atomic_read(&(osb->journal->j_num_trans)) != 0);
 
-        if (ocfs2_mount_local(osb)) {
+        if (!status && ocfs2_mount_local(osb)) {
                 jbd2_journal_lock_updates(journal->j_journal);
                 status = jbd2_journal_flush(journal->j_journal, 0);
                 jbd2_journal_unlock_updates(journal->j_journal);
@@ -2352,6 +2360,11 @@ static int ocfs2_commit_thread(void *arg)
                         if (printk_timed_ratelimit(&abort_warn_time, 60*HZ))
                                 mlog(ML_ERROR, "status = %d, journal is "
                                                 "already aborted.\n", status);
+
+                        if (status == -EROFS) {
+                                osb->commit_task = NULL;
+                                return status;
+                        }
                         /*
                          * After ocfs2_commit_cache() fails, j_num_trans has a
                          * non-zero value. Sleep here to avoid a busy-wait
diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h
index 41c382f68529..080726ffa24d 100644
--- a/fs/ocfs2/journal.h
+++ b/fs/ocfs2/journal.h
@@ -192,7 +192,7 @@ static inline void ocfs2_checkpoint_inode(struct inode *inode)
 {
         struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
 
-        if (ocfs2_mount_local(osb))
+        if (ocfs2_mount_local(osb) || sb_rdonly(inode->i_sb))
                 return;
 
         if (!ocfs2_ci_fully_checkpointed(INODE_CACHE(inode))) {
@@ -205,7 +205,8 @@ static inline void ocfs2_checkpoint_inode(struct inode *inode)
                 ocfs2_start_checkpoint(osb);
 
                 wait_event(osb->journal->j_checkpointed,
-                           ocfs2_ci_fully_checkpointed(INODE_CACHE(inode)));
+                           ocfs2_ci_fully_checkpointed(INODE_CACHE(inode)) ||
+                           sb_rdonly(inode->i_sb));
         }
 }
diff --git a/fs/ocfs2/localalloc.c b/fs/ocfs2/localalloc.c
index c4426d12a2ad..e2e3400717b0 100644
--- a/fs/ocfs2/localalloc.c
+++ b/fs/ocfs2/localalloc.c
@@ -378,6 +378,9 @@ void ocfs2_shutdown_local_alloc(struct ocfs2_super *osb)
         if (osb->ocfs2_wq)
                 flush_workqueue(osb->ocfs2_wq);
 
+        if (is_journal_aborted(osb->journal->j_journal))
+                goto out;
+
         if (osb->local_alloc_state == OCFS2_LA_UNUSED)
                 goto out;
```
fstest generic cases 347, 361, 628 and 629 trigger the same issue:
when jbd2 enters the ABORT status, ocfs2 ignores it and keeps
committing the journal. This causes umount to fail (hang).

This commit gives ocfs2 the ability to handle the jbd2 ABORT case.
With this patch applied, umount succeeds and the journal is left
dirty on disk.

Signed-off-by: Heming Zhao <heming.zhao@suse.com>
---
v2:
(this v2 is only for patch [2/2], see the URL below)
Add a description to the commit log.
Don't reset ->j_num_trans; leave the journal dirty status on disk.

v1: https://oss.oracle.com/pipermail/ocfs2-devel/2023-April/000896.html
---
 fs/ocfs2/alloc.c      | 10 ++++++----
 fs/ocfs2/journal.c    | 17 +++++++++++++++--
 fs/ocfs2/journal.h    |  5 +++--
 fs/ocfs2/localalloc.c |  3 +++
 4 files changed, 27 insertions(+), 8 deletions(-)
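The change boils down to one pattern: before committing, flushing, or waiting on the journal, first ask jbd2 whether it has already aborted, and bail out with -EROFS instead of blocking forever. Below is a minimal illustrative sketch of that pattern, not the actual ocfs2 code: `is_journal_aborted()` and `journal_t` are the real jbd2 helpers from `include/linux/jbd2.h`, while `example_commit_cache()` is a hypothetical name used only for this example.

```
#include <linux/jbd2.h>
#include <linux/errno.h>

/*
 * Illustrative sketch (hypothetical function, not from the patch):
 * if jbd2 has already aborted, stop committing and return -EROFS so
 * callers such as a commit thread can exit instead of busy-waiting
 * on a journal that can never be checkpointed again.
 */
static int example_commit_cache(journal_t *journal)
{
        if (is_journal_aborted(journal))
                return -EROFS;

        /* ... normal journal flush/commit work would go here ... */
        return 0;
}
```

The patch applies this kind of check in ocfs2_commit_cache(), ocfs2_journal_shutdown(), ocfs2_truncate_log_shutdown() and ocfs2_shutdown_local_alloc(), as shown in the diff above, which is why umount can now complete while the on-disk journal is deliberately left dirty for later replay/repair.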