Message ID | be31d007-5104-e534-eec6-931ff5df5444@redhat.com (mailing list archive) |
---|---|
State | Accepted, archived |
Headers | show |
Series | [V2] xfs_repair: fix progress reporting | expand |
On Mon, May 18, 2020 at 08:29:40PM -0500, Eric Sandeen wrote: > The Fixes: commit tried to avoid a segfault in case the progress timer > went off before the first message type had been set up, but this > had the net effect of short-circuiting the pthread start routine, > and so the timer didn't get set up at all and we lost all fine-grained > progress reporting. > > The initial problem occurred when log zeroing took more time than the > timer interval. > > So, make a new log zeroing progress item and initialize it when we first > set up the timer thread, to be sure that if the timer goes off while we > are still zeroing the log, it will be initialized and correct. > > (We can't offer fine-grained status on log zeroing, so it'll go from > zero to $LOGBLOCKS with nothing in between, but it's unlikely that log > zeroing will take so long that this really matters.) Fixable in a separate patch if anyone else is motivated <wink>... > Reported-by: Leonardo Vaz <lvaz@redhat.com> > Fixes: 7f2d6b811755 ("xfs_repair: avoid segfault if reporting progre...") > Signed-off-by: Eric Sandeen <sandeen@redhat.com> Looks ok, Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> --D > --- > > diff --git a/repair/phase2.c b/repair/phase2.c > index 40ea2f84..952ac4a5 100644 > --- a/repair/phase2.c > +++ b/repair/phase2.c > @@ -120,6 +120,9 @@ zero_log( > do_error(_("failed to clear log")); > } > > + /* And we are now magically complete! */ > + PROG_RPT_INC(prog_rpt_done[0], mp->m_sb.sb_logblocks); > + > /* > * Finally, seed the max LSN from the current state of the log if this > * is a v5 filesystem. > @@ -160,7 +163,10 @@ phase2( > > /* Zero log if applicable */ > do_log(_(" - zero log...\n")); > + > + set_progress_msg(PROG_FMT_ZERO_LOG, (uint64_t)mp->m_sb.sb_logblocks); > zero_log(mp); > + print_final_rpt(); > > do_log(_(" - scan filesystem freespace and inode maps...\n")); > > diff --git a/repair/progress.c b/repair/progress.c > index 5ee08229..e5a9c1ef 100644 > --- a/repair/progress.c > +++ b/repair/progress.c > @@ -49,35 +49,37 @@ typedef struct progress_rpt_s { > > static > progress_rpt_t progress_rpt_reports[] = { > -{FMT1, N_("scanning filesystem freespace"), /* 0 */ > +{FMT1, N_("zeroing log"), /* 0 */ > + &rpt_fmts[FMT1], &rpt_types[TYPE_BLOCK]}, > +{FMT1, N_("scanning filesystem freespace"), /* 1 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, > -{FMT1, N_("scanning agi unlinked lists"), /* 1 */ > +{FMT1, N_("scanning agi unlinked lists"), /* 2 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, > -{FMT2, N_("check uncertain AG inodes"), /* 2 */ > +{FMT2, N_("check uncertain AG inodes"), /* 3 */ > &rpt_fmts[FMT2], &rpt_types[TYPE_AGI_BUCKET]}, > -{FMT1, N_("process known inodes and inode discovery"), /* 3 */ > +{FMT1, N_("process known inodes and inode discovery"), /* 4 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_INODE]}, > -{FMT1, N_("process newly discovered inodes"), /* 4 */ > +{FMT1, N_("process newly discovered inodes"), /* 5 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, > -{FMT1, N_("setting up duplicate extent list"), /* 5 */ > +{FMT1, N_("setting up duplicate extent list"), /* 6 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, > -{FMT1, N_("initialize realtime bitmap"), /* 6 */ > +{FMT1, N_("initialize realtime bitmap"), /* 7 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_BLOCK]}, > -{FMT1, N_("reset realtime bitmaps"), /* 7 */ > +{FMT1, N_("reset realtime bitmaps"), /* 8 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, > -{FMT1, N_("check for inodes claiming duplicate blocks"), /* 8 */ > +{FMT1, N_("check for inodes claiming duplicate blocks"), /* 9 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_INODE]}, > -{FMT1, N_("rebuild AG headers and trees"), /* 9 */ > +{FMT1, N_("rebuild AG headers and trees"), /* 10 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, > -{FMT1, N_("traversing filesystem"), /* 10 */ > +{FMT1, N_("traversing filesystem"), /* 12 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, > -{FMT2, N_("traversing all unattached subtrees"), /* 11 */ > +{FMT2, N_("traversing all unattached subtrees"), /* 12 */ > &rpt_fmts[FMT2], &rpt_types[TYPE_DIR]}, > -{FMT2, N_("moving disconnected inodes to lost+found"), /* 12 */ > +{FMT2, N_("moving disconnected inodes to lost+found"), /* 13 */ > &rpt_fmts[FMT2], &rpt_types[TYPE_INODE]}, > -{FMT1, N_("verify and correct link counts"), /* 13 */ > +{FMT1, N_("verify and correct link counts"), /* 14 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, > -{FMT1, N_("verify link counts"), /* 14 */ > +{FMT1, N_("verify link counts"), /* 15 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_AG]} > }; > > @@ -125,7 +127,8 @@ init_progress_rpt (void) > */ > > pthread_mutex_init(&global_msgs.mutex, NULL); > - global_msgs.format = NULL; > + /* Make sure the format is set to the first phase and not NULL */ > + global_msgs.format = &progress_rpt_reports[PROG_FMT_ZERO_LOG]; > global_msgs.count = glob_agcount; > global_msgs.interval = report_interval; > global_msgs.done = prog_rpt_done; > diff --git a/repair/progress.h b/repair/progress.h > index 9de9eb72..2c1690db 100644 > --- a/repair/progress.h > +++ b/repair/progress.h > @@ -8,26 +8,27 @@ > #define PHASE_END 1 > > > -#define PROG_FMT_SCAN_AG 0 /* Phase 2 */ > +#define PROG_FMT_ZERO_LOG 0 /* Phase 2 */ > +#define PROG_FMT_SCAN_AG 1 > > -#define PROG_FMT_AGI_UNLINKED 1 /* Phase 3 */ > -#define PROG_FMT_UNCERTAIN 2 > -#define PROG_FMT_PROCESS_INO 3 > -#define PROG_FMT_NEW_INODES 4 > +#define PROG_FMT_AGI_UNLINKED 2 /* Phase 3 */ > +#define PROG_FMT_UNCERTAIN 3 > +#define PROG_FMT_PROCESS_INO 4 > +#define PROG_FMT_NEW_INODES 5 > > -#define PROG_FMT_DUP_EXTENT 5 /* Phase 4 */ > -#define PROG_FMT_INIT_RTEXT 6 > -#define PROG_FMT_RESET_RTBM 7 > -#define PROG_FMT_DUP_BLOCKS 8 > +#define PROG_FMT_DUP_EXTENT 6 /* Phase 4 */ > +#define PROG_FMT_INIT_RTEXT 7 > +#define PROG_FMT_RESET_RTBM 8 > +#define PROG_FMT_DUP_BLOCKS 9 > > -#define PROG_FMT_REBUILD_AG 9 /* Phase 5 */ > +#define PROG_FMT_REBUILD_AG 10 /* Phase 5 */ > > -#define PROG_FMT_TRAVERSAL 10 /* Phase 6 */ > -#define PROG_FMT_TRAVERSSUB 11 > -#define PROG_FMT_DISCONINODE 12 > +#define PROG_FMT_TRAVERSAL 11 /* Phase 6 */ > +#define PROG_FMT_TRAVERSSUB 12 > +#define PROG_FMT_DISCONINODE 13 > > -#define PROGRESS_FMT_CORR_LINK 13 /* Phase 7 */ > -#define PROGRESS_FMT_VRFY_LINK 14 > +#define PROGRESS_FMT_CORR_LINK 14 /* Phase 7 */ > +#define PROGRESS_FMT_VRFY_LINK 15 > > #define DURATION_BUF_SIZE 512 > > >
On 19/05/2020 11:29, Eric Sandeen wrote: > The Fixes: commit tried to avoid a segfault in case the progress timer > went off before the first message type had been set up, but this > had the net effect of short-circuiting the pthread start routine, > and so the timer didn't get set up at all and we lost all fine-grained > progress reporting. > > The initial problem occurred when log zeroing took more time than the > timer interval. > > So, make a new log zeroing progress item and initialize it when we first > set up the timer thread, to be sure that if the timer goes off while we > are still zeroing the log, it will be initialized and correct. > > (We can't offer fine-grained status on log zeroing, so it'll go from > zero to $LOGBLOCKS with nothing in between, but it's unlikely that log > zeroing will take so long that this really matters.) > > Reported-by: Leonardo Vaz <lvaz@redhat.com> > Fixes: 7f2d6b811755 ("xfs_repair: avoid segfault if reporting progre...") > Signed-off-by: Eric Sandeen <sandeen@redhat.com> > --- I've been looking at this myself, got stuck writing an xfstest, which this passes, though the fix I was trying missed at least one of the formatters that this fixes, and the log zeroing is a nice touch. Reviewed-by: Donald Douwsma <ddouwsma@redhat.com> -- Don > > diff --git a/repair/phase2.c b/repair/phase2.c > index 40ea2f84..952ac4a5 100644 > --- a/repair/phase2.c > +++ b/repair/phase2.c > @@ -120,6 +120,9 @@ zero_log( > do_error(_("failed to clear log")); > } > > + /* And we are now magically complete! */ > + PROG_RPT_INC(prog_rpt_done[0], mp->m_sb.sb_logblocks); > + > /* > * Finally, seed the max LSN from the current state of the log if this > * is a v5 filesystem. > @@ -160,7 +163,10 @@ phase2( > > /* Zero log if applicable */ > do_log(_(" - zero log...\n")); > + > + set_progress_msg(PROG_FMT_ZERO_LOG, (uint64_t)mp->m_sb.sb_logblocks); > zero_log(mp); > + print_final_rpt(); > > do_log(_(" - scan filesystem freespace and inode maps...\n")); > > diff --git a/repair/progress.c b/repair/progress.c > index 5ee08229..e5a9c1ef 100644 > --- a/repair/progress.c > +++ b/repair/progress.c > @@ -49,35 +49,37 @@ typedef struct progress_rpt_s { > > static > progress_rpt_t progress_rpt_reports[] = { > -{FMT1, N_("scanning filesystem freespace"), /* 0 */ > +{FMT1, N_("zeroing log"), /* 0 */ > + &rpt_fmts[FMT1], &rpt_types[TYPE_BLOCK]}, > +{FMT1, N_("scanning filesystem freespace"), /* 1 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, > -{FMT1, N_("scanning agi unlinked lists"), /* 1 */ > +{FMT1, N_("scanning agi unlinked lists"), /* 2 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, > -{FMT2, N_("check uncertain AG inodes"), /* 2 */ > +{FMT2, N_("check uncertain AG inodes"), /* 3 */ > &rpt_fmts[FMT2], &rpt_types[TYPE_AGI_BUCKET]}, > -{FMT1, N_("process known inodes and inode discovery"), /* 3 */ > +{FMT1, N_("process known inodes and inode discovery"), /* 4 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_INODE]}, > -{FMT1, N_("process newly discovered inodes"), /* 4 */ > +{FMT1, N_("process newly discovered inodes"), /* 5 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, > -{FMT1, N_("setting up duplicate extent list"), /* 5 */ > +{FMT1, N_("setting up duplicate extent list"), /* 6 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, > -{FMT1, N_("initialize realtime bitmap"), /* 6 */ > +{FMT1, N_("initialize realtime bitmap"), /* 7 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_BLOCK]}, > -{FMT1, N_("reset realtime bitmaps"), /* 7 */ > +{FMT1, N_("reset realtime bitmaps"), /* 8 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, > -{FMT1, N_("check for inodes claiming duplicate blocks"), /* 8 */ > +{FMT1, N_("check for inodes claiming duplicate blocks"), /* 9 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_INODE]}, > -{FMT1, N_("rebuild AG headers and trees"), /* 9 */ > +{FMT1, N_("rebuild AG headers and trees"), /* 10 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, > -{FMT1, N_("traversing filesystem"), /* 10 */ > +{FMT1, N_("traversing filesystem"), /* 12 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, > -{FMT2, N_("traversing all unattached subtrees"), /* 11 */ > +{FMT2, N_("traversing all unattached subtrees"), /* 12 */ > &rpt_fmts[FMT2], &rpt_types[TYPE_DIR]}, > -{FMT2, N_("moving disconnected inodes to lost+found"), /* 12 */ > +{FMT2, N_("moving disconnected inodes to lost+found"), /* 13 */ > &rpt_fmts[FMT2], &rpt_types[TYPE_INODE]}, > -{FMT1, N_("verify and correct link counts"), /* 13 */ > +{FMT1, N_("verify and correct link counts"), /* 14 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, > -{FMT1, N_("verify link counts"), /* 14 */ > +{FMT1, N_("verify link counts"), /* 15 */ > &rpt_fmts[FMT1], &rpt_types[TYPE_AG]} > }; > > @@ -125,7 +127,8 @@ init_progress_rpt (void) > */ > > pthread_mutex_init(&global_msgs.mutex, NULL); > - global_msgs.format = NULL; > + /* Make sure the format is set to the first phase and not NULL */ > + global_msgs.format = &progress_rpt_reports[PROG_FMT_ZERO_LOG]; > global_msgs.count = glob_agcount; > global_msgs.interval = report_interval; > global_msgs.done = prog_rpt_done; > diff --git a/repair/progress.h b/repair/progress.h > index 9de9eb72..2c1690db 100644 > --- a/repair/progress.h > +++ b/repair/progress.h > @@ -8,26 +8,27 @@ > #define PHASE_END 1 > > > -#define PROG_FMT_SCAN_AG 0 /* Phase 2 */ > +#define PROG_FMT_ZERO_LOG 0 /* Phase 2 */ > +#define PROG_FMT_SCAN_AG 1 > > -#define PROG_FMT_AGI_UNLINKED 1 /* Phase 3 */ > -#define PROG_FMT_UNCERTAIN 2 > -#define PROG_FMT_PROCESS_INO 3 > -#define PROG_FMT_NEW_INODES 4 > +#define PROG_FMT_AGI_UNLINKED 2 /* Phase 3 */ > +#define PROG_FMT_UNCERTAIN 3 > +#define PROG_FMT_PROCESS_INO 4 > +#define PROG_FMT_NEW_INODES 5 > > -#define PROG_FMT_DUP_EXTENT 5 /* Phase 4 */ > -#define PROG_FMT_INIT_RTEXT 6 > -#define PROG_FMT_RESET_RTBM 7 > -#define PROG_FMT_DUP_BLOCKS 8 > +#define PROG_FMT_DUP_EXTENT 6 /* Phase 4 */ > +#define PROG_FMT_INIT_RTEXT 7 > +#define PROG_FMT_RESET_RTBM 8 > +#define PROG_FMT_DUP_BLOCKS 9 > > -#define PROG_FMT_REBUILD_AG 9 /* Phase 5 */ > +#define PROG_FMT_REBUILD_AG 10 /* Phase 5 */ > > -#define PROG_FMT_TRAVERSAL 10 /* Phase 6 */ > -#define PROG_FMT_TRAVERSSUB 11 > -#define PROG_FMT_DISCONINODE 12 > +#define PROG_FMT_TRAVERSAL 11 /* Phase 6 */ > +#define PROG_FMT_TRAVERSSUB 12 > +#define PROG_FMT_DISCONINODE 13 > > -#define PROGRESS_FMT_CORR_LINK 13 /* Phase 7 */ > -#define PROGRESS_FMT_VRFY_LINK 14 > +#define PROGRESS_FMT_CORR_LINK 14 /* Phase 7 */ > +#define PROGRESS_FMT_VRFY_LINK 15 > > #define DURATION_BUF_SIZE 512 > > >
On 5/19/20 2:03 AM, Donald Douwsma wrote: > > On 19/05/2020 11:29, Eric Sandeen wrote: >> The Fixes: commit tried to avoid a segfault in case the progress timer >> went off before the first message type had been set up, but this >> had the net effect of short-circuiting the pthread start routine, >> and so the timer didn't get set up at all and we lost all fine-grained >> progress reporting. >> >> The initial problem occurred when log zeroing took more time than the >> timer interval. >> >> So, make a new log zeroing progress item and initialize it when we first >> set up the timer thread, to be sure that if the timer goes off while we >> are still zeroing the log, it will be initialized and correct. >> >> (We can't offer fine-grained status on log zeroing, so it'll go from >> zero to $LOGBLOCKS with nothing in between, but it's unlikely that log >> zeroing will take so long that this really matters.) >> >> Reported-by: Leonardo Vaz <lvaz@redhat.com> >> Fixes: 7f2d6b811755 ("xfs_repair: avoid segfault if reporting progre...") >> Signed-off-by: Eric Sandeen <sandeen@redhat.com> >> --- > I've been looking at this myself, got stuck writing an xfstest, which this > passes, though the fix I was trying missed at least one of the formatters > that this fixes, and the log zeroing is a nice touch. > > Reviewed-by: Donald Douwsma <ddouwsma@redhat.com> Hm, serves you right for writing a validation test. ;) sorry :( But: "the fix I was trying missed at least one of the formatters that this fixes" - which one is that? I didn't think I fixed any existing thing. -Eric
On 19/05/2020 22:53, Eric Sandeen wrote: > On 5/19/20 2:03 AM, Donald Douwsma wrote: >> >> On 19/05/2020 11:29, Eric Sandeen wrote: >>> The Fixes: commit tried to avoid a segfault in case the progress timer >>> went off before the first message type had been set up, but this >>> had the net effect of short-circuiting the pthread start routine, >>> and so the timer didn't get set up at all and we lost all fine-grained >>> progress reporting. >>> >>> The initial problem occurred when log zeroing took more time than the >>> timer interval. >>> >>> So, make a new log zeroing progress item and initialize it when we first >>> set up the timer thread, to be sure that if the timer goes off while we >>> are still zeroing the log, it will be initialized and correct. >>> >>> (We can't offer fine-grained status on log zeroing, so it'll go from >>> zero to $LOGBLOCKS with nothing in between, but it's unlikely that log >>> zeroing will take so long that this really matters.) >>> >>> Reported-by: Leonardo Vaz <lvaz@redhat.com> >>> Fixes: 7f2d6b811755 ("xfs_repair: avoid segfault if reporting progre...") >>> Signed-off-by: Eric Sandeen <sandeen@redhat.com> >>> --- >> I've been looking at this myself, got stuck writing an xfstest, which this >> passes, though the fix I was trying missed at least one of the formatters >> that this fixes, and the log zeroing is a nice touch. >> >> Reviewed-by: Donald Douwsma <ddouwsma@redhat.com> > > Hm, serves you right for writing a validation test. ;) sorry :( > My fix ended up being a bit... primitive, so all good! > But: "the fix I was trying missed at least one of the formatters > that this fixes" - which one is that? I didn't think I fixed any existing > thing. Actually its probably my test that's a bit busted, I wanted to catch the all of the formatting variations, but some are time sensitive, so its hit second vs seconds and my filter didn't cope. Ran: xfs/516 Failures: xfs/516 Failed 1 of 1 tests [root@rhel7 xfstests-dev]# diff -u /root/Devel/upstream/xfstests-dev/tests/xfs/516.out /root/Devel/upstream/xfstests-dev/results//xfs/516.out.bad --- /root/Devel/upstream/xfstests-dev/tests/xfs/516.out 2020-05-19 15:49:58.736465391 +1000 +++ /root/Devel/upstream/xfstests-dev/results//xfs/516.out.bad 2020-05-19 16:30:45.257220692 +1000 @@ -2,6 +2,7 @@ Format and populate Introduce a dmdelay Run repair + - #:#:#: Phase #: #% done - estimated remaining time # minutes, # second - #:#:#: Phase #: #% done - estimated remaining time # minutes, # seconds - #:#:#: Phase #: elapsed time # second - processed # inodes per minute - #:#:#: Phase #: elapsed time # seconds - processed # inodes per minute @@ -13,3 +14,4 @@ - #:#:#: scanning filesystem freespace - # of # allocation groups done - #:#:#: setting up duplicate extent list - # of # allocation groups done - #:#:#: verify and correct link counts - # of # allocation groups done + - #:#:#: zeroing log - # of # blocks done
diff --git a/repair/phase2.c b/repair/phase2.c index 40ea2f84..952ac4a5 100644 --- a/repair/phase2.c +++ b/repair/phase2.c @@ -120,6 +120,9 @@ zero_log( do_error(_("failed to clear log")); } + /* And we are now magically complete! */ + PROG_RPT_INC(prog_rpt_done[0], mp->m_sb.sb_logblocks); + /* * Finally, seed the max LSN from the current state of the log if this * is a v5 filesystem. @@ -160,7 +163,10 @@ phase2( /* Zero log if applicable */ do_log(_(" - zero log...\n")); + + set_progress_msg(PROG_FMT_ZERO_LOG, (uint64_t)mp->m_sb.sb_logblocks); zero_log(mp); + print_final_rpt(); do_log(_(" - scan filesystem freespace and inode maps...\n")); diff --git a/repair/progress.c b/repair/progress.c index 5ee08229..e5a9c1ef 100644 --- a/repair/progress.c +++ b/repair/progress.c @@ -49,35 +49,37 @@ typedef struct progress_rpt_s { static progress_rpt_t progress_rpt_reports[] = { -{FMT1, N_("scanning filesystem freespace"), /* 0 */ +{FMT1, N_("zeroing log"), /* 0 */ + &rpt_fmts[FMT1], &rpt_types[TYPE_BLOCK]}, +{FMT1, N_("scanning filesystem freespace"), /* 1 */ &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, -{FMT1, N_("scanning agi unlinked lists"), /* 1 */ +{FMT1, N_("scanning agi unlinked lists"), /* 2 */ &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, -{FMT2, N_("check uncertain AG inodes"), /* 2 */ +{FMT2, N_("check uncertain AG inodes"), /* 3 */ &rpt_fmts[FMT2], &rpt_types[TYPE_AGI_BUCKET]}, -{FMT1, N_("process known inodes and inode discovery"), /* 3 */ +{FMT1, N_("process known inodes and inode discovery"), /* 4 */ &rpt_fmts[FMT1], &rpt_types[TYPE_INODE]}, -{FMT1, N_("process newly discovered inodes"), /* 4 */ +{FMT1, N_("process newly discovered inodes"), /* 5 */ &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, -{FMT1, N_("setting up duplicate extent list"), /* 5 */ +{FMT1, N_("setting up duplicate extent list"), /* 6 */ &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, -{FMT1, N_("initialize realtime bitmap"), /* 6 */ +{FMT1, N_("initialize realtime bitmap"), /* 7 */ &rpt_fmts[FMT1], &rpt_types[TYPE_BLOCK]}, -{FMT1, N_("reset realtime bitmaps"), /* 7 */ +{FMT1, N_("reset realtime bitmaps"), /* 8 */ &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, -{FMT1, N_("check for inodes claiming duplicate blocks"), /* 8 */ +{FMT1, N_("check for inodes claiming duplicate blocks"), /* 9 */ &rpt_fmts[FMT1], &rpt_types[TYPE_INODE]}, -{FMT1, N_("rebuild AG headers and trees"), /* 9 */ +{FMT1, N_("rebuild AG headers and trees"), /* 10 */ &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, -{FMT1, N_("traversing filesystem"), /* 10 */ +{FMT1, N_("traversing filesystem"), /* 12 */ &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, -{FMT2, N_("traversing all unattached subtrees"), /* 11 */ +{FMT2, N_("traversing all unattached subtrees"), /* 12 */ &rpt_fmts[FMT2], &rpt_types[TYPE_DIR]}, -{FMT2, N_("moving disconnected inodes to lost+found"), /* 12 */ +{FMT2, N_("moving disconnected inodes to lost+found"), /* 13 */ &rpt_fmts[FMT2], &rpt_types[TYPE_INODE]}, -{FMT1, N_("verify and correct link counts"), /* 13 */ +{FMT1, N_("verify and correct link counts"), /* 14 */ &rpt_fmts[FMT1], &rpt_types[TYPE_AG]}, -{FMT1, N_("verify link counts"), /* 14 */ +{FMT1, N_("verify link counts"), /* 15 */ &rpt_fmts[FMT1], &rpt_types[TYPE_AG]} }; @@ -125,7 +127,8 @@ init_progress_rpt (void) */ pthread_mutex_init(&global_msgs.mutex, NULL); - global_msgs.format = NULL; + /* Make sure the format is set to the first phase and not NULL */ + global_msgs.format = &progress_rpt_reports[PROG_FMT_ZERO_LOG]; global_msgs.count = glob_agcount; global_msgs.interval = report_interval; global_msgs.done = prog_rpt_done; diff --git a/repair/progress.h b/repair/progress.h index 9de9eb72..2c1690db 100644 --- a/repair/progress.h +++ b/repair/progress.h @@ -8,26 +8,27 @@ #define PHASE_END 1 -#define PROG_FMT_SCAN_AG 0 /* Phase 2 */ +#define PROG_FMT_ZERO_LOG 0 /* Phase 2 */ +#define PROG_FMT_SCAN_AG 1 -#define PROG_FMT_AGI_UNLINKED 1 /* Phase 3 */ -#define PROG_FMT_UNCERTAIN 2 -#define PROG_FMT_PROCESS_INO 3 -#define PROG_FMT_NEW_INODES 4 +#define PROG_FMT_AGI_UNLINKED 2 /* Phase 3 */ +#define PROG_FMT_UNCERTAIN 3 +#define PROG_FMT_PROCESS_INO 4 +#define PROG_FMT_NEW_INODES 5 -#define PROG_FMT_DUP_EXTENT 5 /* Phase 4 */ -#define PROG_FMT_INIT_RTEXT 6 -#define PROG_FMT_RESET_RTBM 7 -#define PROG_FMT_DUP_BLOCKS 8 +#define PROG_FMT_DUP_EXTENT 6 /* Phase 4 */ +#define PROG_FMT_INIT_RTEXT 7 +#define PROG_FMT_RESET_RTBM 8 +#define PROG_FMT_DUP_BLOCKS 9 -#define PROG_FMT_REBUILD_AG 9 /* Phase 5 */ +#define PROG_FMT_REBUILD_AG 10 /* Phase 5 */ -#define PROG_FMT_TRAVERSAL 10 /* Phase 6 */ -#define PROG_FMT_TRAVERSSUB 11 -#define PROG_FMT_DISCONINODE 12 +#define PROG_FMT_TRAVERSAL 11 /* Phase 6 */ +#define PROG_FMT_TRAVERSSUB 12 +#define PROG_FMT_DISCONINODE 13 -#define PROGRESS_FMT_CORR_LINK 13 /* Phase 7 */ -#define PROGRESS_FMT_VRFY_LINK 14 +#define PROGRESS_FMT_CORR_LINK 14 /* Phase 7 */ +#define PROGRESS_FMT_VRFY_LINK 15 #define DURATION_BUF_SIZE 512
The Fixes: commit tried to avoid a segfault in case the progress timer went off before the first message type had been set up, but this had the net effect of short-circuiting the pthread start routine, and so the timer didn't get set up at all and we lost all fine-grained progress reporting. The initial problem occurred when log zeroing took more time than the timer interval. So, make a new log zeroing progress item and initialize it when we first set up the timer thread, to be sure that if the timer goes off while we are still zeroing the log, it will be initialized and correct. (We can't offer fine-grained status on log zeroing, so it'll go from zero to $LOGBLOCKS with nothing in between, but it's unlikely that log zeroing will take so long that this really matters.) Reported-by: Leonardo Vaz <lvaz@redhat.com> Fixes: 7f2d6b811755 ("xfs_repair: avoid segfault if reporting progre...") Signed-off-by: Eric Sandeen <sandeen@redhat.com> ---