xfs_repair: notify user when cache flush starts

Message ID	3ca21cbc-fbe2-4c43-b8af-50bc7467b9cd@redhat.com (mailing list archive)
State	Deferred, archived
Headers	show Return-Path: <linux-xfs-owner@vger.kernel.org> Message-ID: <3ca21cbc-fbe2-4c43-b8af-50bc7467b9cd@redhat.com> Date: Fri, 10 Nov 2023 12:22:30 -0600 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Content-Language: en-US To: "linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org> From: Eric Sandeen <sandeen@redhat.com> Subject: [PATCH] xfs_repair: notify user when cache flush starts Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Precedence: bulk
Series	xfs_repair: notify user when cache flush starts \| expand xfs_repair: notify user when cache flush starts

Message ID

3ca21cbc-fbe2-4c43-b8af-50bc7467b9cd@redhat.com (mailing list archive)

State

Deferred, archived

Headers

Message-ID: <3ca21cbc-fbe2-4c43-b8af-50bc7467b9cd@redhat.com>
Date: Fri, 10 Nov 2023 12:22:30 -0600
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Content-Language: en-US
To: "linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>
From: Eric Sandeen <sandeen@redhat.com>
Subject: [PATCH] xfs_repair: notify user when cache flush starts
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Precedence: bulk

Series

xfs_repair: notify user when cache flush starts | expand

Commit Message

Eric Sandeen Nov. 10, 2023, 6:22 p.m. UTC

We recently had the opportunity to run xfs_repair on a system
with 2T of memory and over a billion inodes. After phase 7
had completed, xfs_repair appeared to have hung for over an
hour as the massive cache was written back.

In the long run it might be nice to see if we can add progress
reporting to the cache flush if it's sufficiently large, but
for now at least let the user know what's going on.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---

Comments

Darrick J. Wong Nov. 10, 2023, 7:40 p.m. UTC | #1

On Fri, Nov 10, 2023 at 12:22:30PM -0600, Eric Sandeen wrote:
> We recently had the opportunity to run xfs_repair on a system
> with 2T of memory and over a billion inodes. After phase 7
> had completed, xfs_repair appeared to have hung for over an
> hour as the massive cache was written back.
> 
> In the long run it might be nice to see if we can add progress
> reporting to the cache flush if it's sufficiently large, but
> for now at least let the user know what's going on.
> 
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> ---
> 
> diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
> index ff29bea9..5597b9ba 100644
> --- a/repair/xfs_repair.c
> +++ b/repair/xfs_repair.c
> @@ -1388,6 +1388,7 @@
>  	 * verifiers are run (where we discover the max metadata LSN), reformat
>  	 * the log if necessary and unmount.
>  	 */
> +	do_log(_("Flushing cache...\n"));

Does this new message affect the golden output of any fstests?

Also, while this /does/ flush the libxfs b(uffer)cache, I worry that
the phrasing will lead people into thinking that *disk* caches are being
flushed.  That doesn't happen until libxfs_close_devices.  I suggest:

"Writing dirty buffers..." ?

--D

>  	libxfs_bcache_flush();
>  	format_log_max_lsn(mp);
> 
>

Eric Sandeen Nov. 13, 2023, 4:33 p.m. UTC | #2

On 11/10/23 1:40 PM, Darrick J. Wong wrote:
> On Fri, Nov 10, 2023 at 12:22:30PM -0600, Eric Sandeen wrote:
>> We recently had the opportunity to run xfs_repair on a system
>> with 2T of memory and over a billion inodes. After phase 7
>> had completed, xfs_repair appeared to have hung for over an
>> hour as the massive cache was written back.
>>
>> In the long run it might be nice to see if we can add progress
>> reporting to the cache flush if it's sufficiently large, but
>> for now at least let the user know what's going on.
>>
>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>> ---
>>
>> diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
>> index ff29bea9..5597b9ba 100644
>> --- a/repair/xfs_repair.c
>> +++ b/repair/xfs_repair.c
>> @@ -1388,6 +1388,7 @@
>>  	 * verifiers are run (where we discover the max metadata LSN), reformat
>>  	 * the log if necessary and unmount.
>>  	 */
>> +	do_log(_("Flushing cache...\n"));
> 
> Does this new message affect the golden output of any fstests?

Ugh, yes, 6 or so. I'll add something to _filter_repair().

> 
> Also, while this /does/ flush the libxfs b(uffer)cache, I worry that
> the phrasing will lead people into thinking that *disk* caches are being
> flushed.  That doesn't happen until libxfs_close_devices.  I suggest:
> 
> "Writing dirty buffers..." ?

That's an excellent suggestion. I'll follow up w/ that change and a fix for
xfstests, thanks for thinking of that.

-Eric

> --D
> 
>>  	libxfs_bcache_flush();
>>  	format_log_max_lsn(mp);
>>
>>
>

diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
index ff29bea9..5597b9ba 100644
--- a/repair/xfs_repair.c
+++ b/repair/xfs_repair.c
@@ -1388,6 +1388,7 @@ 
 	 * verifiers are run (where we discover the max metadata LSN), reformat
 	 * the log if necessary and unmount.
 	 */
+	do_log(_("Flushing cache...\n"));
 	libxfs_bcache_flush();
 	format_log_max_lsn(mp);

xfs_repair: notify user when cache flush starts

Commit Message

Comments

Patch