diff mbox series

jbd2: add a missing data flush during file and fs synchronization

Message ID 20241206111327.4171337-1-yi.zhang@huaweicloud.com (mailing list archive)
State New
Headers show
Series jbd2: add a missing data flush during file and fs synchronization | expand

Commit Message

Zhang Yi Dec. 6, 2024, 11:13 a.m. UTC
From: Zhang Yi <yi.zhang@huawei.com>

When the filesystem performs file or filesystem synchronization (e.g.,
ext4_sync_file()), it queries the journal to determine whether to flush
the file device through jbd2_trans_will_send_data_barrier(). If the
target transaction has not started committing, it assumes that the
journal will submit the flush command, allowing the filesystem to bypass
a redundant flush command. However, this assumption is not always valid.
If the journal is not located on the filesystem device, the journal
commit thread will not submit the flush command unless the variable
->t_need_data_flush is set to 1. Consequently, the flush may be missed,
and data may be lost following a power failure or system crash, even if
the synchronization appears to succeed.

Unfortunately, we cannot determine with certainty whether the target
transaction will flush to the filesystem device before it commits.
However, if it has not started committing, it must be the running
transaction. Therefore, fix it by always set its t_need_data_flush to 1,
ensuring that the committing thread will flush the filesystem device.

Fixes: bbd2be369107 ("jbd2: Add function jbd2_trans_will_send_data_barrier()")
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/jbd2/journal.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

Comments

Jan Kara Dec. 6, 2024, 3:04 p.m. UTC | #1
On Fri 06-12-24 19:13:27, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
> 
> When the filesystem performs file or filesystem synchronization (e.g.,
> ext4_sync_file()), it queries the journal to determine whether to flush
> the file device through jbd2_trans_will_send_data_barrier(). If the
> target transaction has not started committing, it assumes that the
> journal will submit the flush command, allowing the filesystem to bypass
> a redundant flush command. However, this assumption is not always valid.
> If the journal is not located on the filesystem device, the journal
> commit thread will not submit the flush command unless the variable
> ->t_need_data_flush is set to 1. Consequently, the flush may be missed,
> and data may be lost following a power failure or system crash, even if
> the synchronization appears to succeed.
> 
> Unfortunately, we cannot determine with certainty whether the target
> transaction will flush to the filesystem device before it commits.
> However, if it has not started committing, it must be the running
> transaction. Therefore, fix it by always set its t_need_data_flush to 1,
> ensuring that the committing thread will flush the filesystem device.
> 
> Fixes: bbd2be369107 ("jbd2: Add function jbd2_trans_will_send_data_barrier()")
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/jbd2/journal.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> index 97f487c3d8fc..37632ae18a4e 100644
> --- a/fs/jbd2/journal.c
> +++ b/fs/jbd2/journal.c
> @@ -609,7 +609,7 @@ int jbd2_journal_start_commit(journal_t *journal, tid_t *ptid)
>  int jbd2_trans_will_send_data_barrier(journal_t *journal, tid_t tid)
>  {
>  	int ret = 0;
> -	transaction_t *commit_trans;
> +	transaction_t *commit_trans, *running_trans;
>  
>  	if (!(journal->j_flags & JBD2_BARRIER))
>  		return 0;
> @@ -619,6 +619,16 @@ int jbd2_trans_will_send_data_barrier(journal_t *journal, tid_t tid)
>  		goto out;
>  	commit_trans = journal->j_committing_transaction;
>  	if (!commit_trans || commit_trans->t_tid != tid) {
> +		running_trans = journal->j_running_transaction;
> +		/*
> +		 * The query transaction hasn't started committing,
> +		 * it must still be running.
> +		 */
> +		if (WARN_ON_ONCE(!running_trans ||
> +				 running_trans->t_tid != tid))
> +			goto out;
> +
> +		running_trans->t_need_data_flush = 1;
>  		ret = 1;
>  		goto out;
>  	}
> -- 
> 2.46.1
>
Zhang Yi March 18, 2025, 4:37 a.m. UTC | #2
Hi, Ted.

Just wanted to kindly check if this patch might have been
overlooked?

Thanks,
Yi.

On 2024/12/6 19:13, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
> 
> When the filesystem performs file or filesystem synchronization (e.g.,
> ext4_sync_file()), it queries the journal to determine whether to flush
> the file device through jbd2_trans_will_send_data_barrier(). If the
> target transaction has not started committing, it assumes that the
> journal will submit the flush command, allowing the filesystem to bypass
> a redundant flush command. However, this assumption is not always valid.
> If the journal is not located on the filesystem device, the journal
> commit thread will not submit the flush command unless the variable
> ->t_need_data_flush is set to 1. Consequently, the flush may be missed,
> and data may be lost following a power failure or system crash, even if
> the synchronization appears to succeed.
> 
> Unfortunately, we cannot determine with certainty whether the target
> transaction will flush to the filesystem device before it commits.
> However, if it has not started committing, it must be the running
> transaction. Therefore, fix it by always set its t_need_data_flush to 1,
> ensuring that the committing thread will flush the filesystem device.
> 
> Fixes: bbd2be369107 ("jbd2: Add function jbd2_trans_will_send_data_barrier()")
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> ---
>  fs/jbd2/journal.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> index 97f487c3d8fc..37632ae18a4e 100644
> --- a/fs/jbd2/journal.c
> +++ b/fs/jbd2/journal.c
> @@ -609,7 +609,7 @@ int jbd2_journal_start_commit(journal_t *journal, tid_t *ptid)
>  int jbd2_trans_will_send_data_barrier(journal_t *journal, tid_t tid)
>  {
>  	int ret = 0;
> -	transaction_t *commit_trans;
> +	transaction_t *commit_trans, *running_trans;
>  
>  	if (!(journal->j_flags & JBD2_BARRIER))
>  		return 0;
> @@ -619,6 +619,16 @@ int jbd2_trans_will_send_data_barrier(journal_t *journal, tid_t tid)
>  		goto out;
>  	commit_trans = journal->j_committing_transaction;
>  	if (!commit_trans || commit_trans->t_tid != tid) {
> +		running_trans = journal->j_running_transaction;
> +		/*
> +		 * The query transaction hasn't started committing,
> +		 * it must still be running.
> +		 */
> +		if (WARN_ON_ONCE(!running_trans ||
> +				 running_trans->t_tid != tid))
> +			goto out;
> +
> +		running_trans->t_need_data_flush = 1;
>  		ret = 1;
>  		goto out;
>  	}
Theodore Ts'o March 22, 2025, 3:36 a.m. UTC | #3
On Fri, 06 Dec 2024 19:13:27 +0800, Zhang Yi wrote:
> When the filesystem performs file or filesystem synchronization (e.g.,
> ext4_sync_file()), it queries the journal to determine whether to flush
> the file device through jbd2_trans_will_send_data_barrier(). If the
> target transaction has not started committing, it assumes that the
> journal will submit the flush command, allowing the filesystem to bypass
> a redundant flush command. However, this assumption is not always valid.
> If the journal is not located on the filesystem device, the journal
> commit thread will not submit the flush command unless the variable
> ->t_need_data_flush is set to 1. Consequently, the flush may be missed,
> and data may be lost following a power failure or system crash, even if
> the synchronization appears to succeed.
> 
> [...]

Applied, thanks!

[1/1] jbd2: add a missing data flush during file and fs synchronization
      commit: aac45075f6d79a63ac8dff93b3e1d7053a6ba628

Best regards,
diff mbox series

Patch

diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index 97f487c3d8fc..37632ae18a4e 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -609,7 +609,7 @@  int jbd2_journal_start_commit(journal_t *journal, tid_t *ptid)
 int jbd2_trans_will_send_data_barrier(journal_t *journal, tid_t tid)
 {
 	int ret = 0;
-	transaction_t *commit_trans;
+	transaction_t *commit_trans, *running_trans;
 
 	if (!(journal->j_flags & JBD2_BARRIER))
 		return 0;
@@ -619,6 +619,16 @@  int jbd2_trans_will_send_data_barrier(journal_t *journal, tid_t tid)
 		goto out;
 	commit_trans = journal->j_committing_transaction;
 	if (!commit_trans || commit_trans->t_tid != tid) {
+		running_trans = journal->j_running_transaction;
+		/*
+		 * The query transaction hasn't started committing,
+		 * it must still be running.
+		 */
+		if (WARN_ON_ONCE(!running_trans ||
+				 running_trans->t_tid != tid))
+			goto out;
+
+		running_trans->t_need_data_flush = 1;
 		ret = 1;
 		goto out;
 	}