Message ID | 1563758631-29550-20-git-send-email-jsimmons@infradead.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ldiskfs patches against 5.2-rc2+ | expand |
On Sun, Jul 21 2019, James Simmons wrote: > When ldiskfs run in failover mode whith read-only disk. > Part of allocation updates are lost and ldiskfs may fail > while mounting this is due to inconsistent state of > group-descriptor. Group-descriptor check is added after > journal replay. I think this needs to be enabled by a mount option or super-block flag. NeilBrown > > Signed-off-by: James Simmons <jsimmons@infradead.org> > --- > fs/ext4/super.c | 11 ++++++----- > 1 file changed, 6 insertions(+), 5 deletions(-) > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > index a3179b2..b818acb 100644 > --- a/fs/ext4/super.c > +++ b/fs/ext4/super.c > @@ -4255,11 +4255,6 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) > } > } > sbi->s_gdb_count = db_count; > - if (!ext4_check_descriptors(sb, logical_sb_block, &first_not_zeroed)) { > - ext4_msg(sb, KERN_ERR, "group descriptors corrupted!"); > - ret = -EFSCORRUPTED; > - goto failed_mount2; > - } > > timer_setup(&sbi->s_err_report, print_daily_error_info, 0); > > @@ -4401,6 +4396,12 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) > sbi->s_journal->j_commit_callback = ext4_journal_commit_callback; > > no_journal: > + if (!ext4_check_descriptors(sb, logical_sb_block, &first_not_zeroed)) { > + ext4_msg(sb, KERN_ERR, "group descriptors corrupted!"); > + ret = -EFSCORRUPTED; > + goto failed_mount_wq; > + } > + > if (!test_opt(sb, NO_MBCACHE)) { > sbi->s_ea_block_cache = ext4_xattr_create_cache(); > if (!sbi->s_ea_block_cache) { > -- > 1.8.3.1
On Mon, Jul 22 2019, Alexey Lyashkov wrote: > Why? > Purpose of this patch is simple and don’t addressed a failover in general. > Crash can occurred in commit time - when journal _partially_ flushed to the FS. Checking any FS metadata in this time is wrong, because we have no guarantee to be consistence. > But checking an after journal replay is fine, it check have verification no corruption hit and FS is fine. If the corruption can occur in non-ldiskfs usage, and would be fixed by a journal replay, then yes - the patch looks like a good idea. Possibly I misunderstood the source of the corruption... maybe if that could be made clearer in the commit message, that would help. Thanks, NeilBrown > > >> 22 июля 2019 г., в 8:29, NeilBrown <neilb@suse.com> написал(а): >> >> On Sun, Jul 21 2019, James Simmons wrote: >> >>> When ldiskfs run in failover mode whith read-only disk. >>> Part of allocation updates are lost and ldiskfs may fail >>> while mounting this is due to inconsistent state of >>> group-descriptor. Group-descriptor check is added after >>> journal replay. >> >> I think this needs to be enabled by a mount option or super-block flag. >> >> NeilBrown >> >> >>> >>> Signed-off-by: James Simmons <jsimmons@infradead.org> >>> --- >>> fs/ext4/super.c | 11 ++++++----- >>> 1 file changed, 6 insertions(+), 5 deletions(-) >>> >>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >>> index a3179b2..b818acb 100644 >>> --- a/fs/ext4/super.c >>> +++ b/fs/ext4/super.c >>> @@ -4255,11 +4255,6 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) >>> } >>> } >>> sbi->s_gdb_count = db_count; >>> - if (!ext4_check_descriptors(sb, logical_sb_block, &first_not_zeroed)) { >>> - ext4_msg(sb, KERN_ERR, "group descriptors corrupted!"); >>> - ret = -EFSCORRUPTED; >>> - goto failed_mount2; >>> - } >>> >>> timer_setup(&sbi->s_err_report, print_daily_error_info, 0); >>> >>> @@ -4401,6 +4396,12 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) >>> sbi->s_journal->j_commit_callback = ext4_journal_commit_callback; >>> >>> no_journal: >>> + if (!ext4_check_descriptors(sb, logical_sb_block, &first_not_zeroed)) { >>> + ext4_msg(sb, KERN_ERR, "group descriptors corrupted!"); >>> + ret = -EFSCORRUPTED; >>> + goto failed_mount_wq; >>> + } >>> + >>> if (!test_opt(sb, NO_MBCACHE)) { >>> sbi->s_ea_block_cache = ext4_xattr_create_cache(); >>> if (!sbi->s_ea_block_cache) { >>> -- >>> 1.8.3.1 >> _______________________________________________ >> lustre-devel mailing list >> lustre-devel@lists.lustre.org >> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org
> On Jul 22, 2019, at 2:46 AM, NeilBrown <neilb@suse.com> wrote: > > On Mon, Jul 22 2019, Alexey Lyashkov wrote: > >> Why? >> Purpose of this patch is simple and don’t addressed a failover in general. >> Crash can occurred in commit time - when journal _partially_ flushed to the FS. Checking any FS metadata in this time is wrong, because we have no guarantee to be consistence. >> But checking an after journal replay is fine, it check have verification no corruption hit and FS is fine. > > If the corruption can occur in non-ldiskfs usage, and would be fixed by > a journal replay, then yes - the patch looks like a good idea. > > Possibly I misunderstood the source of the corruption... maybe if that > could be made clearer in the commit message, that would help. I think the argument is: wtf do we even look at metadata (or anything, really) before journal replay. The patch is actually good because it does move the read after the journal replay though? > >> >> >>> 22 июля 2019 г., в 8:29, NeilBrown <neilb@suse.com> написал(а): >>> >>> On Sun, Jul 21 2019, James Simmons wrote: >>> >>>> When ldiskfs run in failover mode whith read-only disk. >>>> Part of allocation updates are lost and ldiskfs may fail >>>> while mounting this is due to inconsistent state of >>>> group-descriptor. Group-descriptor check is added after >>>> journal replay. >>> >>> I think this needs to be enabled by a mount option or super-block flag. >>> >>> NeilBrown >>> >>> >>>> >>>> Signed-off-by: James Simmons <jsimmons@infradead.org> >>>> --- >>>> fs/ext4/super.c | 11 ++++++----- >>>> 1 file changed, 6 insertions(+), 5 deletions(-) >>>> >>>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >>>> index a3179b2..b818acb 100644 >>>> --- a/fs/ext4/super.c >>>> +++ b/fs/ext4/super.c >>>> @@ -4255,11 +4255,6 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) >>>> } >>>> } >>>> sbi->s_gdb_count = db_count; >>>> - if (!ext4_check_descriptors(sb, logical_sb_block, &first_not_zeroed)) { >>>> - ext4_msg(sb, KERN_ERR, "group descriptors corrupted!"); >>>> - ret = -EFSCORRUPTED; >>>> - goto failed_mount2; >>>> - } >>>> >>>> timer_setup(&sbi->s_err_report, print_daily_error_info, 0); >>>> >>>> @@ -4401,6 +4396,12 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) >>>> sbi->s_journal->j_commit_callback = ext4_journal_commit_callback; >>>> >>>> no_journal: >>>> + if (!ext4_check_descriptors(sb, logical_sb_block, &first_not_zeroed)) { >>>> + ext4_msg(sb, KERN_ERR, "group descriptors corrupted!"); >>>> + ret = -EFSCORRUPTED; >>>> + goto failed_mount_wq; >>>> + } >>>> + >>>> if (!test_opt(sb, NO_MBCACHE)) { >>>> sbi->s_ea_block_cache = ext4_xattr_create_cache(); >>>> if (!sbi->s_ea_block_cache) { >>>> -- >>>> 1.8.3.1 >>> _______________________________________________ >>> lustre-devel mailing list >>> lustre-devel@lists.lustre.org >>> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org
> 22 июля 2019 г., в 9:56, Oleg Drokin <green@whamcloud.com> написал(а): > > > >> On Jul 22, 2019, at 2:46 AM, NeilBrown <neilb@suse.com> wrote: >> >> On Mon, Jul 22 2019, Alexey Lyashkov wrote: >> >>> Why? >>> Purpose of this patch is simple and don’t addressed a failover in general. >>> Crash can occurred in commit time - when journal _partially_ flushed to the FS. Checking any FS metadata in this time is wrong, because we have no guarantee to be consistence. >>> But checking an after journal replay is fine, it check have verification no corruption hit and FS is fine. >> >> If the corruption can occur in non-ldiskfs usage, and would be fixed by >> a journal replay, then yes - the patch looks like a good idea. Yes, bug should fixed by journal replay. BUT this check was run _before_ journal replay and patch move groups check _after_ journal replay done. >> >> Possibly I misunderstood the source of the corruption... maybe if that >> could be made clearer in the commit message, that would help. > > I think the argument is: wtf do we even look at metadata (or anything, really) > before journal replay. > The patch is actually good because it does move the read after the journal replay > though? Oleg, you are right. > > >> >>> >>> >>>> 22 июля 2019 г., в 8:29, NeilBrown <neilb@suse.com> написал(а): >>>> >>>> On Sun, Jul 21 2019, James Simmons wrote: >>>> >>>>> When ldiskfs run in failover mode whith read-only disk. >>>>> Part of allocation updates are lost and ldiskfs may fail >>>>> while mounting this is due to inconsistent state of >>>>> group-descriptor. Group-descriptor check is added after >>>>> journal replay. >>>> >>>> I think this needs to be enabled by a mount option or super-block flag. >>>> >>>> NeilBrown >>>> >>>> >>>>> >>>>> Signed-off-by: James Simmons <jsimmons@infradead.org> >>>>> --- >>>>> fs/ext4/super.c | 11 ++++++----- >>>>> 1 file changed, 6 insertions(+), 5 deletions(-) >>>>> >>>>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >>>>> index a3179b2..b818acb 100644 >>>>> --- a/fs/ext4/super.c >>>>> +++ b/fs/ext4/super.c >>>>> @@ -4255,11 +4255,6 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) >>>>> } >>>>> } >>>>> sbi->s_gdb_count = db_count; >>>>> - if (!ext4_check_descriptors(sb, logical_sb_block, &first_not_zeroed)) { >>>>> - ext4_msg(sb, KERN_ERR, "group descriptors corrupted!"); >>>>> - ret = -EFSCORRUPTED; >>>>> - goto failed_mount2; >>>>> - } >>>>> >>>>> timer_setup(&sbi->s_err_report, print_daily_error_info, 0); >>>>> >>>>> @@ -4401,6 +4396,12 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) >>>>> sbi->s_journal->j_commit_callback = ext4_journal_commit_callback; >>>>> >>>>> no_journal: >>>>> + if (!ext4_check_descriptors(sb, logical_sb_block, &first_not_zeroed)) { >>>>> + ext4_msg(sb, KERN_ERR, "group descriptors corrupted!"); >>>>> + ret = -EFSCORRUPTED; >>>>> + goto failed_mount_wq; >>>>> + } >>>>> + >>>>> if (!test_opt(sb, NO_MBCACHE)) { >>>>> sbi->s_ea_block_cache = ext4_xattr_create_cache(); >>>>> if (!sbi->s_ea_block_cache) { >>>>> -- >>>>> 1.8.3.1 >>>> _______________________________________________ >>>> lustre-devel mailing list >>>> lustre-devel@lists.lustre.org >>>> http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org > > _______________________________________________ > lustre-devel mailing list > lustre-devel@lists.lustre.org > http://lists.lustre.org/listinfo.cgi/lustre-devel-lustre.org
Actually, I think this patch would be OK to push upstream. Cheers, Andreas > On Jul 21, 2019, at 23:29, NeilBrown <neilb@suse.com> wrote: > >> On Sun, Jul 21 2019, James Simmons wrote: >> >> When ldiskfs run in failover mode whith read-only disk. >> Part of allocation updates are lost and ldiskfs may fail >> while mounting this is due to inconsistent state of >> group-descriptor. Group-descriptor check is added after >> journal replay. > > I think this needs to be enabled by a mount option or super-block flag. > > NeilBrown > > >> >> Signed-off-by: James Simmons <jsimmons@infradead.org> >> --- >> fs/ext4/super.c | 11 ++++++----- >> 1 file changed, 6 insertions(+), 5 deletions(-) >> >> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >> index a3179b2..b818acb 100644 >> --- a/fs/ext4/super.c >> +++ b/fs/ext4/super.c >> @@ -4255,11 +4255,6 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) >> } >> } >> sbi->s_gdb_count = db_count; >> - if (!ext4_check_descriptors(sb, logical_sb_block, &first_not_zeroed)) { >> - ext4_msg(sb, KERN_ERR, "group descriptors corrupted!"); >> - ret = -EFSCORRUPTED; >> - goto failed_mount2; >> - } >> >> timer_setup(&sbi->s_err_report, print_daily_error_info, 0); >> >> @@ -4401,6 +4396,12 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) >> sbi->s_journal->j_commit_callback = ext4_journal_commit_callback; >> >> no_journal: >> + if (!ext4_check_descriptors(sb, logical_sb_block, &first_not_zeroed)) { >> + ext4_msg(sb, KERN_ERR, "group descriptors corrupted!"); >> + ret = -EFSCORRUPTED; >> + goto failed_mount_wq; >> + } >> + >> if (!test_opt(sb, NO_MBCACHE)) { >> sbi->s_ea_block_cache = ext4_xattr_create_cache(); >> if (!sbi->s_ea_block_cache) { >> -- >> 1.8.3.1
what I think needs to happen is a better description. Something like: In a crash group descriptors might not be written completely in place that would lead to FS error message on subsequent mount. Move the check to after journal replay to ensure we are dealing with up to date (and hopefully correct) information before declaring the FS as bad. > On Jul 22, 2019, at 9:57 PM, Andreas Dilger <adilger@whamcloud.com> wrote: > > Actually, I think this patch would be OK to push upstream. > > Cheers, Andreas > >> On Jul 21, 2019, at 23:29, NeilBrown <neilb@suse.com> wrote: >> >>> On Sun, Jul 21 2019, James Simmons wrote: >>> >>> When ldiskfs run in failover mode whith read-only disk. >>> Part of allocation updates are lost and ldiskfs may fail >>> while mounting this is due to inconsistent state of >>> group-descriptor. Group-descriptor check is added after >>> journal replay. >> >> I think this needs to be enabled by a mount option or super-block flag. >> >> NeilBrown >> >> >>> >>> Signed-off-by: James Simmons <jsimmons@infradead.org> >>> --- >>> fs/ext4/super.c | 11 ++++++----- >>> 1 file changed, 6 insertions(+), 5 deletions(-) >>> >>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >>> index a3179b2..b818acb 100644 >>> --- a/fs/ext4/super.c >>> +++ b/fs/ext4/super.c >>> @@ -4255,11 +4255,6 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) >>> } >>> } >>> sbi->s_gdb_count = db_count; >>> - if (!ext4_check_descriptors(sb, logical_sb_block, &first_not_zeroed)) { >>> - ext4_msg(sb, KERN_ERR, "group descriptors corrupted!"); >>> - ret = -EFSCORRUPTED; >>> - goto failed_mount2; >>> - } >>> >>> timer_setup(&sbi->s_err_report, print_daily_error_info, 0); >>> >>> @@ -4401,6 +4396,12 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) >>> sbi->s_journal->j_commit_callback = ext4_journal_commit_callback; >>> >>> no_journal: >>> + if (!ext4_check_descriptors(sb, logical_sb_block, &first_not_zeroed)) { >>> + ext4_msg(sb, KERN_ERR, "group descriptors corrupted!"); >>> + ret = -EFSCORRUPTED; >>> + goto failed_mount_wq; >>> + } >>> + >>> if (!test_opt(sb, NO_MBCACHE)) { >>> sbi->s_ea_block_cache = ext4_xattr_create_cache(); >>> if (!sbi->s_ea_block_cache) { >>> -- >>> 1.8.3.1
diff --git a/fs/ext4/super.c b/fs/ext4/super.c index a3179b2..b818acb 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -4255,11 +4255,6 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) } } sbi->s_gdb_count = db_count; - if (!ext4_check_descriptors(sb, logical_sb_block, &first_not_zeroed)) { - ext4_msg(sb, KERN_ERR, "group descriptors corrupted!"); - ret = -EFSCORRUPTED; - goto failed_mount2; - } timer_setup(&sbi->s_err_report, print_daily_error_info, 0); @@ -4401,6 +4396,12 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) sbi->s_journal->j_commit_callback = ext4_journal_commit_callback; no_journal: + if (!ext4_check_descriptors(sb, logical_sb_block, &first_not_zeroed)) { + ext4_msg(sb, KERN_ERR, "group descriptors corrupted!"); + ret = -EFSCORRUPTED; + goto failed_mount_wq; + } + if (!test_opt(sb, NO_MBCACHE)) { sbi->s_ea_block_cache = ext4_xattr_create_cache(); if (!sbi->s_ea_block_cache) {
When ldiskfs run in failover mode whith read-only disk. Part of allocation updates are lost and ldiskfs may fail while mounting this is due to inconsistent state of group-descriptor. Group-descriptor check is added after journal replay. Signed-off-by: James Simmons <jsimmons@infradead.org> --- fs/ext4/super.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)