diff mbox series

[v2] ocfs2: add error handling path when jbd2 enter ABORT status

Message ID 20230626112453.22571-1-heming.zhao@suse.com (mailing list archive)
State New, archived
Headers show
Series [v2] ocfs2: add error handling path when jbd2 enter ABORT status | expand

Commit Message

heming.zhao@suse.com June 26, 2023, 11:24 a.m. UTC
fstest generic cases 347 361 628 629 trigger a same issue:
When jbd2 enter ABORT status, ocfs2 ignores it and keep going to commit
journal. This issue causes umount failure (hanging).

This commit gives ocfs2 ability to handle jbd2 ABORT case. After
patching, umount successfully and leave journal dirty status.

Signed-off-by: Heming Zhao <heming.zhao@suse.com>
---
v2:
   (this v2 only for patch [2/2], see below URL)
   add description in commit log.
   don't reset ->j_num_trans and leave journal dirty status on disk.

v1: https://oss.oracle.com/pipermail/ocfs2-devel/2023-April/000896.html
---
 fs/ocfs2/alloc.c      | 10 ++++++----
 fs/ocfs2/journal.c    | 17 +++++++++++++++--
 fs/ocfs2/journal.h    |  5 +++--
 fs/ocfs2/localalloc.c |  3 +++
 4 files changed, 27 insertions(+), 8 deletions(-)

Comments

heming.zhao@suse.com June 26, 2023, 2:56 p.m. UTC | #1
On 6/26/23 7:24 PM, Heming Zhao wrote:
> fstest generic cases 347 361 628 629 trigger a same issue:
> When jbd2 enter ABORT status, ocfs2 ignores it and keep going to commit
> journal. This issue causes umount failure (hanging).
> 
> This commit gives ocfs2 ability to handle jbd2 ABORT case. After
> patching, umount successfully and leave journal dirty status.
> 
> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> ---
> v2:
>     (this v2 only for patch [2/2], see below URL)
>     add description in commit log.
>     don't reset ->j_num_trans and leave journal dirty status on disk.
> 
> v1: https://oss.oracle.com/pipermail/ocfs2-devel/2023-April/000896.html
> ---
>   fs/ocfs2/alloc.c      | 10 ++++++----
>   fs/ocfs2/journal.c    | 17 +++++++++++++++--
>   fs/ocfs2/journal.h    |  5 +++--
>   fs/ocfs2/localalloc.c |  3 +++
>   4 files changed, 27 insertions(+), 8 deletions(-)
> 

It looks I found 2 bugs when testing this patch.
1> fsck.ocfs2 doesn't succeed to repair unclean jbd2 data but output successfully.
2> fsck.ocfs2 doesn't remove dlm lockspace when exit.

** how to trigger **

first do below steps:
(/dev/vdg size should bigger than 1GB)

(these steps derived from fstest generic 361)
```

mkfs -t ocfs2 -F -N 4 --cluster-stack pcmk --cluster-name tst -b 4096 /dev/vdg 131072

mount -t ocfs2 /dev/vdg /fstest/scratch

/sbin/xfs_io -i -fc 'truncate 1g' /fstest/scratch/fs.img

mkdir /fstest/scratch/mnt

losetup -f --show /fstest/scratch/fs.img

losetup --direct-io=on /dev/loop0

mkfs -t ocfs2 -N 4 /dev/loop0

mount -t ocfs2 /dev/loop0 /fstest/scratch/mnt



/sbin/xfs_io -i -fc 'pwrite 0 520m' /fstest/scratch/mnt/testfile

/usr/bin/mount -o remount,ro /fstest/scratch/mnt



umount /fstest/scratch/mnt  (<-- trigger jbd2 abort)

```



the last 'umount' will trigger jbd2 enter abort status, then leave unclean
journal data on disk.

then using fsck.ocfs2 repair this loop0 device, this software output 'successfully',
but the real status is this partition still fails to mount.


```

tb-fstest1:~ # mount -t ocfs2 /dev/loop0 /fstest/scratch/mnt

mount.ocfs2: Internal logic failure while mounting /dev/loop0 on /fstest/scratch/mnt. Check 'dmesg' for more information on this error 5.



tb-fstest1:~ # fsck.ocfs2 /dev/loop0

fsck.ocfs2 1.8.7

Checking OCFS2 filesystem in /dev/loop0:

   Label:              <NONE>

   UUID:               4990448088164475A620ACD486ACFC8E

   Number of blocks:   262144

   Block size:         4096

   Number of clusters: 262144

   Cluster size:       4096

   Number of slots:    4



/dev/loop0 wasn't cleanly unmounted by all nodes.  Attempting to replay the journals for nodes that didn't unmount cleanly

Checking each slot's journal.

Replaying slot 0's journal.

*** There were problems replaying journals.  Be careful in telling fsck to make repairs to this filesystem.

Slot 0's local alloc replayed successfully

/dev/loop0 is clean.  It will be checked after 20 additional mounts.

Slot 0's journal dirty flag removed



tb-fstest1:~ # mount -t ocfs2 /dev/loop0 /fstest/scratch/mnt

mount.ocfs2: Internal logic failure while mounting /dev/loop0 on /fstest/scratch/mnt. Check 'dmesg' for more information on this error 17.


tb-fstest1:~ # ls /sys/kernel/dlm/

0B90A8FBD12C4ED3BB947682045C8B51  4990448088164475A620ACD486ACFC8E


tb-fstest1:~ # ps axj |grep ocfs2

     2  1232     0     0 ?           -1 I<       0   0:00 [ocfs2_wq]

     2  1233     0     0 ?           -1 S        0   0:00 [ocfs2dc-0B90A8FBD12C4ED3BB947682045C8B51]

     2  1240     0     0 ?           -1 S        0   0:00 [ocfs2cmt-0B90A8FBD12C4ED3BB947682045C8B51]


tb-fstest1:~ # dlm_tool leave 4990448088164475A620ACD486ACFC8E

Leaving lockspace "4990448088164475A620ACD486ACFC8E"

done


tb-fstest1:~ # mount -t ocfs2 /dev/loop0 /fstest/scratch/mnt

mount.ocfs2: Internal logic failure while mounting /dev/loop0 on /fstest/scratch/mnt. Check 'dmesg' for more information on this error 5.
```




Thanks,
Heming
heming.zhao@suse.com June 26, 2023, 3:09 p.m. UTC | #2
(sorry for last mail with mess format, resend)

On Mon, Jun 26, 2023 at 07:24:53PM +0800, Heming Zhao wrote:
> fstest generic cases 347 361 628 629 trigger a same issue:
> When jbd2 enter ABORT status, ocfs2 ignores it and keep going to commit
> journal. This issue causes umount failure (hanging).
> 
> This commit gives ocfs2 ability to handle jbd2 ABORT case. After
> patching, umount successfully and leave journal dirty status.
> 
> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> ---
> v2:
>    (this v2 only for patch [2/2], see below URL)
>    add description in commit log.
>    don't reset ->j_num_trans and leave journal dirty status on disk.
> 
> v1: https://oss.oracle.com/pipermail/ocfs2-devel/2023-April/000896.html
> ---
>  fs/ocfs2/alloc.c      | 10 ++++++----
>  fs/ocfs2/journal.c    | 17 +++++++++++++++--
>  fs/ocfs2/journal.h    |  5 +++--
>  fs/ocfs2/localalloc.c |  3 +++
>  4 files changed, 27 insertions(+), 8 deletions(-)
> 

It looks I found 2 bugs when testing this patch.
1> fsck.ocfs2 doesn't succeed to repair unclean jbd2 data but output successfully.
2> fsck.ocfs2 doesn't remove dlm lockspace when exit.

** how to trigger **

first do below steps:
(/dev/vdg size should bigger than 1GB)
(these steps derived from fstest generic 361)
```
mkfs -t ocfs2 -F -N 4 --cluster-stack pcmk --cluster-name tst -b 4096 /dev/vdg 131072
mount -t ocfs2 /dev/vdg /fstest/scratch
/sbin/xfs_io -i -fc 'truncate 1g' /fstest/scratch/fs.img
mkdir /fstest/scratch/mnt
losetup -f --show /fstest/scratch/fs.img
losetup --direct-io=on /dev/loop0
mkfs -t ocfs2 -N 4 /dev/loop0
mount -t ocfs2 /dev/loop0 /fstest/scratch/mnt

/sbin/xfs_io -i -fc 'pwrite 0 520m' /fstest/scratch/mnt/testfile
/usr/bin/mount -o remount,ro /fstest/scratch/mnt

umount /fstest/scratch/mnt  (<-- trigger jbd2 abort)
```

The last 'umount' will trigger jbd2 enter abort status, then leave unclean
journal data on disk.
Then using fsck.ocfs2 repair this loop0 device, this software output
'successfully', but the real status is this partition still fails to mount.

```
tb-fstest1:~ # mount -t ocfs2 /dev/loop0 /fstest/scratch/mnt
mount.ocfs2: Internal logic failure while mounting /dev/loop0 on /fstest/scratch/mnt. Check 'dmesg' for more information on this error 5.

tb-fstest1:~ # fsck.ocfs2 /dev/loop0
fsck.ocfs2 1.8.7
Checking OCFS2 filesystem in /dev/loop0:
  Label:              <NONE>
  UUID:               4990448088164475A620ACD486ACFC8E
  Number of blocks:   262144
  Block size:         4096
  Number of clusters: 262144
  Cluster size:       4096
  Number of slots:    4

/dev/loop0 wasn't cleanly unmounted by all nodes.  Attempting to replay the journals for nodes that didn't unmount cleanly
Checking each slot's journal.
Replaying slot 0's journal.
*** There were problems replaying journals.  Be careful in telling fsck to make repairs to this filesystem.
Slot 0's local alloc replayed successfully
/dev/loop0 is clean.  It will be checked after 20 additional mounts.
Slot 0's journal dirty flag removed

tb-fstest1:~ # mount -t ocfs2 /dev/loop0 /fstest/scratch/mnt
mount.ocfs2: Internal logic failure while mounting /dev/loop0 on
/fstest/scratch/mnt. Check 'dmesg' for more information on this
error 17.

tb-fstest1:~ # ls /sys/kernel/dlm/
0B90A8FBD12C4ED3BB947682045C8B51  4990448088164475A620ACD486ACFC8E

tb-fstest1:~ # ps axj |grep ocfs2
2  1232     0     0 ?           -1 I<       0   0:00 [ocfs2_wq]
2  1233     0     0 ?           -1 S        0   0:00 [ocfs2dc-0B90A8FBD12C4ED3BB947682045C8B51]
2  1240     0     0 ?           -1 S        0   0:00 [ocfs2cmt-0B90A8FBD12C4ED3BB947682045C8B51]

tb-fstest1:~ # dlm_tool leave 4990448088164475A620ACD486ACFC8E
Leaving lockspace "4990448088164475A620ACD486ACFC8E"
done

tb-fstest1:~ # mount -t ocfs2 /dev/loop0 /fstest/scratch/mnt
mount.ocfs2: Internal logic failure while mounting /dev/loop0 on /fstest/scratch/mnt. Check 'dmesg' for more information on this error 5.
```

Thanks,
Heming
diff mbox series

Patch

diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c
index 51c93929a146..d90961a1c433 100644
--- a/fs/ocfs2/alloc.c
+++ b/fs/ocfs2/alloc.c
@@ -6308,11 +6308,13 @@  void ocfs2_truncate_log_shutdown(struct ocfs2_super *osb)
 
 	if (tl_inode) {
 		cancel_delayed_work(&osb->osb_truncate_log_wq);
-		flush_workqueue(osb->ocfs2_wq);
+		if (!is_journal_aborted(osb->journal->j_journal)) {
+			flush_workqueue(osb->ocfs2_wq);
 
-		status = ocfs2_flush_truncate_log(osb);
-		if (status < 0)
-			mlog_errno(status);
+			status = ocfs2_flush_truncate_log(osb);
+			if (status < 0)
+				mlog_errno(status);
+		}
 
 		brelse(osb->osb_tl_bh);
 		iput(osb->osb_tl_inode);
diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
index 25d8072ccfce..30de277461b3 100644
--- a/fs/ocfs2/journal.c
+++ b/fs/ocfs2/journal.c
@@ -314,6 +314,11 @@  static int ocfs2_commit_cache(struct ocfs2_super *osb)
 	if (status < 0) {
 		up_write(&journal->j_trans_barrier);
 		mlog_errno(status);
+		if (is_journal_aborted(journal->j_journal)) {
+			ocfs2_error(osb->sb, "jbd2 status: ABORT.\n");
+			wake_up(&journal->j_checkpointed);
+			status = -EROFS;
+		}
 		goto finally;
 	}
 
@@ -1010,6 +1015,8 @@  void ocfs2_journal_shutdown(struct ocfs2_super *osb)
 	if (!journal)
 		goto done;
 
+	status = is_journal_aborted(journal->j_journal);
+
 	inode = journal->j_inode;
 
 	if (journal->j_state != OCFS2_JOURNAL_LOADED)
@@ -1038,9 +1045,10 @@  void ocfs2_journal_shutdown(struct ocfs2_super *osb)
 		osb->commit_task = NULL;
 	}
 
-	BUG_ON(atomic_read(&(osb->journal->j_num_trans)) != 0);
+	BUG_ON(!is_journal_aborted(journal->j_journal) &&
+			atomic_read(&(osb->journal->j_num_trans)) != 0);
 
-	if (ocfs2_mount_local(osb)) {
+	if (!status && ocfs2_mount_local(osb)) {
 		jbd2_journal_lock_updates(journal->j_journal);
 		status = jbd2_journal_flush(journal->j_journal, 0);
 		jbd2_journal_unlock_updates(journal->j_journal);
@@ -2352,6 +2360,11 @@  static int ocfs2_commit_thread(void *arg)
 			if (printk_timed_ratelimit(&abort_warn_time, 60*HZ))
 				mlog(ML_ERROR, "status = %d, journal is "
 						"already aborted.\n", status);
+
+			if (status == -EROFS) {
+				osb->commit_task = NULL;
+				return status;
+			}
 			/*
 			 * After ocfs2_commit_cache() fails, j_num_trans has a
 			 * non-zero value.  Sleep here to avoid a busy-wait
diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h
index 41c382f68529..080726ffa24d 100644
--- a/fs/ocfs2/journal.h
+++ b/fs/ocfs2/journal.h
@@ -192,7 +192,7 @@  static inline void ocfs2_checkpoint_inode(struct inode *inode)
 {
 	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
 
-	if (ocfs2_mount_local(osb))
+	if (ocfs2_mount_local(osb) || sb_rdonly(inode->i_sb))
 		return;
 
 	if (!ocfs2_ci_fully_checkpointed(INODE_CACHE(inode))) {
@@ -205,7 +205,8 @@  static inline void ocfs2_checkpoint_inode(struct inode *inode)
 		ocfs2_start_checkpoint(osb);
 
 		wait_event(osb->journal->j_checkpointed,
-			   ocfs2_ci_fully_checkpointed(INODE_CACHE(inode)));
+			   ocfs2_ci_fully_checkpointed(INODE_CACHE(inode)) ||
+			   sb_rdonly(inode->i_sb));
 	}
 }
 
diff --git a/fs/ocfs2/localalloc.c b/fs/ocfs2/localalloc.c
index c4426d12a2ad..e2e3400717b0 100644
--- a/fs/ocfs2/localalloc.c
+++ b/fs/ocfs2/localalloc.c
@@ -378,6 +378,9 @@  void ocfs2_shutdown_local_alloc(struct ocfs2_super *osb)
 	if (osb->ocfs2_wq)
 		flush_workqueue(osb->ocfs2_wq);
 
+	if (is_journal_aborted(osb->journal->j_journal))
+		goto out;
+
 	if (osb->local_alloc_state == OCFS2_LA_UNUSED)
 		goto out;