Message ID | 20220413082957.28774-1-heming.zhao@suse.com (mailing list archive) |
---|---|
Headers | show |
Series | rewrite error handling during mounting stage | expand |
On 4/13/22 4:29 PM, Heming Zhao wrote: > After commit da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown"), > journal init later than before, it makes NULL pointer access in free > routine. > > Crash flow: > > ocfs2_fill_super > + ocfs2_mount_volume > | + ocfs2_dlm_init //fail & return, osb->journal is NULL. > | + ... > | + ocfs2_check_volume //no chance to init osb->journal > | > + ... > + ocfs2_dismount_volume > ocfs2_release_system_inodes > ... > evict > ... > ocfs2_clear_inode > ocfs2_checkpoint_inode > ocfs2_ci_fully_checkpointed > time_after(journal->j_trans_id, ci->ci_last_trans) > + journal is empty, crash! > > For fixing, there are three solutions: > > 1> Partly revert commit da5e7c87827e8 > > For avoiding kernel crash, this make sense for us. We only concerned > whether there has any non-system inode access before dlm init. The > answer is NO. And all journal replay/recovery handling happen after > dlm & journal init done. So this method is not graceful but workable. > > 2> Add osb->journal check in free inode routine (eg ocfs2_clear_inode) > > The fix code is special for mounting phase, but it will continue > working after mounting stage. In another word, this method adds useless > code in normal inode free flow. > > 3> Do directly free inode in mounting phase > > This method is brutal/complex and may introduce unsafe code, currently > maintainer didn't like. > > At last, we chose method <1> and did partly reverted job. > We reverted journal init codes, and kept cleanup codes flow. > > Fixes: da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown") > Signed-off-by: Heming Zhao <heming.zhao@suse.com> > --- > fs/ocfs2/inode.c | 4 ++-- > fs/ocfs2/journal.c | 32 ++++++++++++++++++++++---------- > fs/ocfs2/super.c | 16 ++++++++++++++++ > 3 files changed, 40 insertions(+), 12 deletions(-) > > diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c > index 5739dc301569..bb116c39b581 100644 > --- a/fs/ocfs2/inode.c > +++ b/fs/ocfs2/inode.c > @@ -125,6 +125,7 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags, > struct inode *inode = NULL; > struct super_block *sb = osb->sb; > struct ocfs2_find_inode_args args; > + journal_t *journal = osb->journal->j_journal; > > trace_ocfs2_iget_begin((unsigned long long)blkno, flags, > sysfile_type); > @@ -171,11 +172,10 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags, > * part of the transaction - the inode could have been reclaimed and > * now it is reread from disk. > */ > - if (osb->journal) { > + if (journal) { > transaction_t *transaction; > tid_t tid; > struct ocfs2_inode_info *oi = OCFS2_I(inode); > - journal_t *journal = osb->journal->j_journal; > > read_lock(&journal->j_state_lock); > if (journal->j_running_transaction) > diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c > index 1887a2708709..49255fddce45 100644 > --- a/fs/ocfs2/journal.c > +++ b/fs/ocfs2/journal.c > @@ -810,22 +810,20 @@ void ocfs2_set_journal_params(struct ocfs2_super *osb) > write_unlock(&journal->j_state_lock); > } > > -int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty) > +/* > + * alloc & initialize skeleton for journal structure. > + * ocfs2_journal_init() will make fs have journal ability. > + */ > +int ocfs2_journal_alloc(struct ocfs2_super *osb) > { > - int status = -1; > - struct inode *inode = NULL; /* the journal inode */ > - journal_t *j_journal = NULL; > - struct ocfs2_journal *journal = NULL; > - struct ocfs2_dinode *di = NULL; > - struct buffer_head *bh = NULL; > - int inode_lock = 0; > + int status = 0; > + struct ocfs2_journal *journal; > > - /* initialize our journal structure */ > journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL); > if (!journal) { > mlog(ML_ERROR, "unable to alloc journal\n"); > status = -ENOMEM; > - goto done; > + goto bail; > } > osb->journal = journal; > journal->j_osb = osb; > @@ -839,6 +837,20 @@ int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty) > INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery); > journal->j_state = OCFS2_JOURNAL_FREE; > > +bail: > + return status; > +} > + > +int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty) > +{ > + int status = -1; > + struct inode *inode = NULL; /* the journal inode */ > + journal_t *j_journal = NULL; > + struct ocfs2_journal *journal = osb->journal; > + struct ocfs2_dinode *di = NULL; > + struct buffer_head *bh = NULL; > + int inode_lock = 0; > + Better to leave a BUG_ON(!journal) here like before. > /* already have the inode for our journal */ > inode = ocfs2_get_system_file_inode(osb, JOURNAL_SYSTEM_INODE, > osb->slot_num); > diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c > index 477cdf94122e..babec2c9d638 100644 > --- a/fs/ocfs2/super.c > +++ b/fs/ocfs2/super.c > @@ -2015,6 +2015,7 @@ static int ocfs2_initialize_super(struct super_block *sb, > int i, cbits, bbits; > struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data; > struct inode *inode = NULL; > + struct ocfs2_journal *journal; > struct ocfs2_super *osb; > u64 total_blocks; > > @@ -2195,6 +2196,15 @@ static int ocfs2_initialize_super(struct super_block *sb, > > get_random_bytes(&osb->s_next_generation, sizeof(u32)); > > + /* > + * FIXME > + * This should be done in ocfs2_journal_init(), but any inode > + * writes back operation will cause the filesystem to crash. > + */ > + status = ocfs2_journal_alloc(osb); > + if (status) Explicitly mark 'status < 0'? Thanks, Joseph > + goto bail; > + > INIT_WORK(&osb->dquot_drop_work, ocfs2_drop_dquot_refs); > init_llist_head(&osb->dquot_drop_list); > > @@ -2483,6 +2493,12 @@ static void ocfs2_delete_osb(struct ocfs2_super *osb) > > kfree(osb->osb_orphan_wipes); > kfree(osb->slot_recovery_generations); > + /* FIXME > + * This belongs in journal shutdown, but because we have to > + * allocate osb->journal at the middle of ocfs2_initialize_super(), > + * we free it here. > + */ > + kfree(osb->journal); > kfree(osb->local_alloc_copy); > kfree(osb->uuid_str); > kfree(osb->vol_label);
On 4/13/22 4:29 PM, Heming Zhao wrote: > After this patch, when error, ocfs2_fill_super doesn't take care to > release resources which are allocated in ocfs2_initialize_super. > > Signed-off-by: Heming Zhao <heming.zhao@suse.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> > --- > fs/ocfs2/super.c | 59 +++++++++++++++++++++++++++++++++--------------- > 1 file changed, 41 insertions(+), 18 deletions(-) > > diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c > index 7076125f5b46..4302c3e9598c 100644 > --- a/fs/ocfs2/super.c > +++ b/fs/ocfs2/super.c > @@ -2023,7 +2023,7 @@ static int ocfs2_initialize_super(struct super_block *sb, > if (!osb) { > status = -ENOMEM; > mlog_errno(status); > - goto bail; > + goto out; > } > > sb->s_fs_info = osb; > @@ -2084,7 +2084,7 @@ static int ocfs2_initialize_super(struct super_block *sb, > mlog(ML_ERROR, "Invalid number of node slots (%u)\n", > osb->max_slots); > status = -EINVAL; > - goto bail; > + goto out; > } > > ocfs2_orphan_scan_init(osb); > @@ -2093,7 +2093,7 @@ static int ocfs2_initialize_super(struct super_block *sb, > if (status) { > mlog(ML_ERROR, "Unable to initialize recovery state\n"); > mlog_errno(status); > - goto bail; > + goto out; > } > > init_waitqueue_head(&osb->checkpoint_event); > @@ -2117,7 +2117,7 @@ static int ocfs2_initialize_super(struct super_block *sb, > if (!osb->vol_label) { > mlog(ML_ERROR, "unable to alloc vol label\n"); > status = -ENOMEM; > - goto bail; > + goto out_recovery_map; > } > > osb->slot_recovery_generations = > @@ -2126,7 +2126,7 @@ static int ocfs2_initialize_super(struct super_block *sb, > if (!osb->slot_recovery_generations) { > status = -ENOMEM; > mlog_errno(status); > - goto bail; > + goto out_vol_label; > } > > init_waitqueue_head(&osb->osb_wipe_event); > @@ -2136,7 +2136,7 @@ static int ocfs2_initialize_super(struct super_block *sb, > if (!osb->osb_orphan_wipes) { > status = -ENOMEM; > mlog_errno(status); > - goto bail; > + goto out_slot_recovery_gen; > } > > osb->osb_rf_lock_tree = RB_ROOT; > @@ -2152,13 +2152,13 @@ static int ocfs2_initialize_super(struct super_block *sb, > mlog(ML_ERROR, "couldn't mount because of unsupported " > "optional features (%x).\n", i); > status = -EINVAL; > - goto bail; > + goto out_orphan_wipes; > } > if (!sb_rdonly(osb->sb) && (i = OCFS2_HAS_RO_COMPAT_FEATURE(osb->sb, ~OCFS2_FEATURE_RO_COMPAT_SUPP))) { > mlog(ML_ERROR, "couldn't mount RDWR because of " > "unsupported optional features (%x).\n", i); > status = -EINVAL; > - goto bail; > + goto out_orphan_wipes; > } > > if (ocfs2_clusterinfo_valid(osb)) { > @@ -2179,7 +2179,7 @@ static int ocfs2_initialize_super(struct super_block *sb, > "cluster stack label (%s) \n", > osb->osb_cluster_stack); > status = -EINVAL; > - goto bail; > + goto out_orphan_wipes; > } > memcpy(osb->osb_cluster_name, > OCFS2_RAW_SB(di)->s_cluster_info.ci_cluster, > @@ -2199,7 +2199,7 @@ static int ocfs2_initialize_super(struct super_block *sb, > */ > status = ocfs2_journal_alloc(osb); > if (status) > - goto bail; > + goto out_orphan_wipes; > > INIT_WORK(&osb->dquot_drop_work, ocfs2_drop_dquot_refs); > init_llist_head(&osb->dquot_drop_list); > @@ -2214,7 +2214,7 @@ static int ocfs2_initialize_super(struct super_block *sb, > mlog(ML_ERROR, "Volume has invalid cluster size (%d)\n", > osb->s_clustersize); > status = -EINVAL; > - goto bail; > + goto out_journal; > } > > total_blocks = ocfs2_clusters_to_blocks(osb->sb, > @@ -2226,14 +2226,14 @@ static int ocfs2_initialize_super(struct super_block *sb, > mlog(ML_ERROR, "Volume too large " > "to mount safely on this system"); > status = -EFBIG; > - goto bail; > + goto out_journal; > } > > if (ocfs2_setup_osb_uuid(osb, di->id2.i_super.s_uuid, > sizeof(di->id2.i_super.s_uuid))) { > mlog(ML_ERROR, "Out of memory trying to setup our uuid.\n"); > status = -ENOMEM; > - goto bail; > + goto out_journal; > } > > strlcpy(osb->vol_label, di->id2.i_super.s_label, > @@ -2253,7 +2253,7 @@ static int ocfs2_initialize_super(struct super_block *sb, > if (!osb->osb_dlm_debug) { > status = -ENOMEM; > mlog_errno(status); > - goto bail; > + goto out_uuid_str; > } > > atomic_set(&osb->vol_state, VOLUME_INIT); > @@ -2262,7 +2262,7 @@ static int ocfs2_initialize_super(struct super_block *sb, > status = ocfs2_init_global_system_inodes(osb); > if (status < 0) { > mlog_errno(status); > - goto bail; > + goto out_dlm_out; > } > > /* > @@ -2273,7 +2273,7 @@ static int ocfs2_initialize_super(struct super_block *sb, > if (!inode) { > status = -EINVAL; > mlog_errno(status); > - goto bail; > + goto out_system_inodes; > } > > osb->bitmap_blkno = OCFS2_I(inode)->ip_blkno; > @@ -2286,16 +2286,39 @@ static int ocfs2_initialize_super(struct super_block *sb, > status = ocfs2_init_slot_info(osb); > if (status < 0) { > mlog_errno(status); > - goto bail; > + goto out_system_inodes; > } > > osb->ocfs2_wq = alloc_ordered_workqueue("ocfs2_wq", WQ_MEM_RECLAIM); > if (!osb->ocfs2_wq) { > status = -ENOMEM; > mlog_errno(status); > + goto out_slot_info; > } > > -bail: > + return status; > + > +out_slot_info: > + ocfs2_free_slot_info(osb); > +out_system_inodes: > + ocfs2_release_system_inodes(osb); > +out_dlm_out: > + ocfs2_put_dlm_debug(osb->osb_dlm_debug); > +out_uuid_str: > + kfree(osb->uuid_str); > +out_journal: > + kfree(osb->journal); > +out_orphan_wipes: > + kfree(osb->osb_orphan_wipes); > +out_slot_recovery_gen: > + kfree(osb->slot_recovery_generations); > +out_vol_label: > + kfree(osb->vol_label); > +out_recovery_map: > + kfree(osb->recovery_map); > +out: > + kfree(osb); > + sb->s_fs_info = NULL; > return status; > } >
On Wed, Apr 13, 2022 at 07:37:59PM +0800, Joseph Qi wrote: > It mostly looks good to me. > Have you done error injection test after this patchset applied? > I'm afraid ocfs2-test can't cover this as of now. > I plan to write a debugfs api for error injection test. The code only for these patches test My idea: echo N > /sys/kernel/debug/ocfs2/xxx/error N: 1,2,3,4,... "echo N" will make ocfs2 trigger Nth error point become true: ``` status = ocfs2_initialize_super(sb, bh, sector_size, &stats); brelse(bh); bh = NULL; - if (status < 0) + if (error_inject(n)) //eg. "echo 1 > xx" will trigger there "if()" becomes true. goto out; ``` What's your opinion about my idea? Or do you have other method to test? - Heming > Thanks, > Joseph > > On 4/13/22 4:29 PM, Heming Zhao wrote: > > Current ocfs2_fill_super() uses one goto label "read_super_error" to > > handle all error cases. And with previous serial patches, the error > > handling should fork more branches to handle different error cases. > > This patch rewrite the error handling of ocfs2_fill_super. > > > > Signed-off-by: Heming Zhao <heming.zhao@suse.com> > > --- > > fs/ocfs2/super.c | 67 +++++++++++++++++++++++------------------------- > > 1 file changed, 32 insertions(+), 35 deletions(-) > > > > diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c > > index 5e860d7162d7..72673d40d29c 100644 > > --- a/fs/ocfs2/super.c > > +++ b/fs/ocfs2/super.c > > @@ -989,28 +989,27 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent) > > > > if (!ocfs2_parse_options(sb, data, &parsed_options, 0)) { > > status = -EINVAL; > > - goto read_super_error; > > + goto out; > > } > > > > /* probe for superblock */ > > status = ocfs2_sb_probe(sb, &bh, §or_size, &stats); > > if (status < 0) { > > mlog(ML_ERROR, "superblock probe failed!\n"); > > - goto read_super_error; > > + goto out; > > } > > > > status = ocfs2_initialize_super(sb, bh, sector_size, &stats); > > - osb = OCFS2_SB(sb); > > - if (status < 0) { > > - mlog_errno(status); > > - goto read_super_error; > > - } > > brelse(bh); > > bh = NULL; > > + if (status < 0) > > + goto out; > > + > > + osb = OCFS2_SB(sb); > > > > if (!ocfs2_check_set_options(sb, &parsed_options)) { > > status = -EINVAL; > > - goto read_super_error; > > + goto out_super; > > } > > osb->s_mount_opt = parsed_options.mount_opt; > > osb->s_atime_quantum = parsed_options.atime_quantum; > > @@ -1027,7 +1026,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent) > > > > status = ocfs2_verify_userspace_stack(osb, &parsed_options); > > if (status) > > - goto read_super_error; > > + goto out_super; > > > > sb->s_magic = OCFS2_SUPER_MAGIC; > > > > @@ -1041,7 +1040,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent) > > status = -EACCES; > > mlog(ML_ERROR, "Readonly device detected but readonly " > > "mount was not specified.\n"); > > - goto read_super_error; > > + goto out_super; > > } > > > > /* You should not be able to start a local heartbeat > > @@ -1050,7 +1049,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent) > > status = -EROFS; > > mlog(ML_ERROR, "Local heartbeat specified on readonly " > > "device.\n"); > > - goto read_super_error; > > + goto out_super; > > } > > > > status = ocfs2_check_journals_nolocks(osb); > > @@ -1059,9 +1058,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent) > > mlog(ML_ERROR, "Recovery required on readonly " > > "file system, but write access is " > > "unavailable.\n"); > > - else > > - mlog_errno(status); > > - goto read_super_error; > > + goto out_super; > > } > > > > ocfs2_set_ro_flag(osb, 1); > > @@ -1077,10 +1074,8 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent) > > } > > > > status = ocfs2_verify_heartbeat(osb); > > - if (status < 0) { > > - mlog_errno(status); > > - goto read_super_error; > > - } > > + if (status < 0) > > + goto out_super; > > > > osb->osb_debug_root = debugfs_create_dir(osb->uuid_str, > > ocfs2_debugfs_root); > > @@ -1094,15 +1089,14 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent) > > > > status = ocfs2_mount_volume(sb); > > if (status < 0) > > - goto read_super_error; > > + goto out_debugfs; > > > > if (osb->root_inode) > > inode = igrab(osb->root_inode); > > > > if (!inode) { > > status = -EIO; > > - mlog_errno(status); > > - goto read_super_error; > > + goto out_dismount; > > } > > > > osb->osb_dev_kset = kset_create_and_add(sb->s_id, NULL, > > @@ -1110,7 +1104,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent) > > if (!osb->osb_dev_kset) { > > status = -ENOMEM; > > mlog(ML_ERROR, "Unable to create device kset %s.\n", sb->s_id); > > - goto read_super_error; > > + goto out_dismount; > > } > > > > /* Create filecheck sysfs related directories/files at > > @@ -1119,14 +1113,13 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent) > > status = -ENOMEM; > > mlog(ML_ERROR, "Unable to create filecheck sysfs directory at " > > "/sys/fs/ocfs2/%s/filecheck.\n", sb->s_id); > > - goto read_super_error; > > + goto out_dismount; > > } > > > > root = d_make_root(inode); > > if (!root) { > > status = -ENOMEM; > > - mlog_errno(status); > > - goto read_super_error; > > + goto out_dismount; > > } > > > > sb->s_root = root; > > @@ -1178,17 +1171,21 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent) > > > > return status; > > > > -read_super_error: > > - brelse(bh); > > - > > - if (status) > > - mlog_errno(status); > > +out_dismount: > > + atomic_set(&osb->vol_state, VOLUME_DISABLED); > > + wake_up(&osb->osb_mount_event); > > + ocfs2_dismount_volume(sb, 1); > > + goto out; > > > > - if (osb) { > > - atomic_set(&osb->vol_state, VOLUME_DISABLED); > > - wake_up(&osb->osb_mount_event); > > - ocfs2_dismount_volume(sb, 1); > > - } > > +out_debugfs: > > + debugfs_remove_recursive(osb->osb_debug_root); > > +out_super: > > + ocfs2_release_system_inodes(osb); > > + kfree(osb->recovery_map); > > + ocfs2_delete_osb(osb); > > + kfree(osb); > > +out: > > + mlog_errno(status); > > > > return status; > > } >
On 4/13/22 18:54, Joseph Qi wrote: > > > On 4/13/22 4:29 PM, Heming Zhao wrote: >> After commit da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown"), >> journal init later than before, it makes NULL pointer access in free >> routine. >> >> Crash flow: >> >> ocfs2_fill_super >> + ocfs2_mount_volume >> | + ocfs2_dlm_init //fail & return, osb->journal is NULL. >> | + ... >> | + ocfs2_check_volume //no chance to init osb->journal >> | >> + ... >> + ocfs2_dismount_volume >> ocfs2_release_system_inodes >> ... >> evict >> ... >> ocfs2_clear_inode >> ocfs2_checkpoint_inode >> ocfs2_ci_fully_checkpointed >> time_after(journal->j_trans_id, ci->ci_last_trans) >> + journal is empty, crash! >> >> For fixing, there are three solutions: >> >> 1> Partly revert commit da5e7c87827e8 >> >> For avoiding kernel crash, this make sense for us. We only concerned >> whether there has any non-system inode access before dlm init. The >> answer is NO. And all journal replay/recovery handling happen after >> dlm & journal init done. So this method is not graceful but workable. >> >> 2> Add osb->journal check in free inode routine (eg ocfs2_clear_inode) >> >> The fix code is special for mounting phase, but it will continue >> working after mounting stage. In another word, this method adds useless >> code in normal inode free flow. >> >> 3> Do directly free inode in mounting phase >> >> This method is brutal/complex and may introduce unsafe code, currently >> maintainer didn't like. >> >> At last, we chose method <1> and did partly reverted job. >> We reverted journal init codes, and kept cleanup codes flow. >> >> Fixes: da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown") >> Signed-off-by: Heming Zhao <heming.zhao@suse.com> >> --- >> fs/ocfs2/inode.c | 4 ++-- >> fs/ocfs2/journal.c | 32 ++++++++++++++++++++++---------- >> fs/ocfs2/super.c | 16 ++++++++++++++++ >> 3 files changed, 40 insertions(+), 12 deletions(-) >> >> diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c >> index 5739dc301569..bb116c39b581 100644 >> --- a/fs/ocfs2/inode.c >> +++ b/fs/ocfs2/inode.c >> @@ -125,6 +125,7 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags, >> struct inode *inode = NULL; >> struct super_block *sb = osb->sb; >> struct ocfs2_find_inode_args args; >> + journal_t *journal = osb->journal->j_journal; >> >> trace_ocfs2_iget_begin((unsigned long long)blkno, flags, >> sysfile_type); >> @@ -171,11 +172,10 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags, >> * part of the transaction - the inode could have been reclaimed and >> * now it is reread from disk. >> */ >> - if (osb->journal) { >> + if (journal) { >> transaction_t *transaction; >> tid_t tid; >> struct ocfs2_inode_info *oi = OCFS2_I(inode); >> - journal_t *journal = osb->journal->j_journal; >> >> read_lock(&journal->j_state_lock); >> if (journal->j_running_transaction) >> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c >> index 1887a2708709..49255fddce45 100644 >> --- a/fs/ocfs2/journal.c >> +++ b/fs/ocfs2/journal.c >> @@ -810,22 +810,20 @@ void ocfs2_set_journal_params(struct ocfs2_super *osb) >> write_unlock(&journal->j_state_lock); >> } >> >> -int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty) >> +/* >> + * alloc & initialize skeleton for journal structure. >> + * ocfs2_journal_init() will make fs have journal ability. >> + */ >> +int ocfs2_journal_alloc(struct ocfs2_super *osb) >> { >> - int status = -1; >> - struct inode *inode = NULL; /* the journal inode */ >> - journal_t *j_journal = NULL; >> - struct ocfs2_journal *journal = NULL; >> - struct ocfs2_dinode *di = NULL; >> - struct buffer_head *bh = NULL; >> - int inode_lock = 0; >> + int status = 0; >> + struct ocfs2_journal *journal; >> >> - /* initialize our journal structure */ >> journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL); >> if (!journal) { >> mlog(ML_ERROR, "unable to alloc journal\n"); >> status = -ENOMEM; >> - goto done; >> + goto bail; >> } >> osb->journal = journal; >> journal->j_osb = osb; >> @@ -839,6 +837,20 @@ int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty) >> INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery); >> journal->j_state = OCFS2_JOURNAL_FREE; >> >> +bail: >> + return status; >> +} >> + >> +int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty) >> +{ >> + int status = -1; >> + struct inode *inode = NULL; /* the journal inode */ >> + journal_t *j_journal = NULL; >> + struct ocfs2_journal *journal = osb->journal; >> + struct ocfs2_dinode *di = NULL; >> + struct buffer_head *bh = NULL; >> + int inode_lock = 0; >> + > > Better to leave a BUG_ON(!journal) here like before. OK. restore BUG_ON(). The reason I removed it: In theory it's absolutely impossible become NULL. > >> /* already have the inode for our journal */ >> inode = ocfs2_get_system_file_inode(osb, JOURNAL_SYSTEM_INODE, >> osb->slot_num); >> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c >> index 477cdf94122e..babec2c9d638 100644 >> --- a/fs/ocfs2/super.c >> +++ b/fs/ocfs2/super.c >> @@ -2015,6 +2015,7 @@ static int ocfs2_initialize_super(struct super_block *sb, >> int i, cbits, bbits; >> struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data; >> struct inode *inode = NULL; >> + struct ocfs2_journal *journal; >> struct ocfs2_super *osb; >> u64 total_blocks; >> >> @@ -2195,6 +2196,15 @@ static int ocfs2_initialize_super(struct super_block *sb, >> >> get_random_bytes(&osb->s_next_generation, sizeof(u32)); >> >> + /* >> + * FIXME >> + * This should be done in ocfs2_journal_init(), but any inode >> + * writes back operation will cause the filesystem to crash. >> + */ >> + status = ocfs2_journal_alloc(osb); >> + if (status) > > Explicitly mark 'status < 0'? > Sorry, my stupid mistake. I didn't compile & test v2 patch, and plan to test before v3. For test, please check my comment in patch 5/5 - heming > >> + goto bail; >> + >> INIT_WORK(&osb->dquot_drop_work, ocfs2_drop_dquot_refs); >> init_llist_head(&osb->dquot_drop_list); >> >> @@ -2483,6 +2493,12 @@ static void ocfs2_delete_osb(struct ocfs2_super *osb) >> >> kfree(osb->osb_orphan_wipes); >> kfree(osb->slot_recovery_generations); >> + /* FIXME >> + * This belongs in journal shutdown, but because we have to >> + * allocate osb->journal at the middle of ocfs2_initialize_super(), >> + * we free it here. >> + */ >> + kfree(osb->journal); >> kfree(osb->local_alloc_copy); >> kfree(osb->uuid_str); >> kfree(osb->vol_label); >
On 4/13/22 9:11 PM, Heming Zhao wrote: > On Wed, Apr 13, 2022 at 07:37:59PM +0800, Joseph Qi wrote: >> It mostly looks good to me. >> Have you done error injection test after this patchset applied? >> I'm afraid ocfs2-test can't cover this as of now. >> > > I plan to write a debugfs api for error injection test. > The code only for these patches test > > My idea: > echo N > /sys/kernel/debug/ocfs2/xxx/error > N: 1,2,3,4,... > > "echo N" will make ocfs2 trigger Nth error point become true: > > ``` > status = ocfs2_initialize_super(sb, bh, sector_size, &stats); > brelse(bh); > bh = NULL; > - if (status < 0) > + if (error_inject(n)) //eg. "echo 1 > xx" will trigger there "if()" becomes true. > goto out; > ``` > > What's your opinion about my idea? > Or do you have other method to test? Good idea, actually we did it in this way before since existing fault-injection can only inject common failures such as slab allocation. Thanks, Joseph