Message ID | 1426871656-28032-9-git-send-email-jbacik@fb.com (mailing list archive) |
---|---|
State | New, archived |
Sorry for a late reply. I was ill last week...

On Fri 20-03-15 13:14:16, Josef Bacik wrote:
> On a box with a lot of ram (148gb) I can make the box softlockup after running
> an fs_mark job that creates hundreds of millions of empty files. This is
> because we never generate enough memory pressure to keep the number of inodes on
> our unused list low, so when we go to unmount we have to evict ~100 million
> inodes. This makes one processor a very unhappy person, so add a cond_resched()
> in dispose_list() and cond_resched_lock() in the eviction isolation function to
> combat this. Thanks,
>
> Signed-off-by: Josef Bacik <jbacik@fb.com>
> ---
> fs/inode.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/fs/inode.c b/fs/inode.c
> index b961e5a..c58dbd3 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -574,6 +574,7 @@ static void dispose_list(struct list_head *head)
>  		list_del_init(&inode->i_lru);
>
>  		evict(inode);
> +		cond_resched();

Fine.

>  	}
>  }
>
> @@ -592,6 +593,7 @@ void evict_inodes(struct super_block *sb)
>  	LIST_HEAD(dispose);
>
>  	spin_lock(&sb->s_inode_list_lock);
> +again:
>  	list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list) {
>  		if (atomic_read(&inode->i_count))
>  			continue;
> @@ -606,6 +608,14 @@ void evict_inodes(struct super_block *sb)
>  		inode_lru_list_del(inode);
>  		spin_unlock(&inode->i_lock);
>  		list_add(&inode->i_lru, &dispose);
> +
> +		/*
> +		 * We can have a ton of inodes to evict at unmount time given
> +		 * enough memory, check to see if we need to go to sleep for a
> +		 * bit so we don't livelock.
> +		 */
> +		if (cond_resched_lock(&sb->s_inode_list_lock))
> +			goto again;

Not so fine. How is this ever guaranteed to finish? We don't move inodes
from the i_sb_list in this loop so if we ever take 'goto again' we just
start doing all the work from the beginning...

What needs to happen is that if we need to resched, we drop
sb->s_inode_list_lock, call dispose_list(&dispose) and *then* restart from
the beginning since we have freed all the inodes that we isolated...

								Honza
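The restart scheme described above can be modelled in userspace. The sketch below is not kernel code and not the eventual fix: `struct item`, `struct sb_model`, `need_break()`, `dispose_list_model()`, `evict_all_model()` and `run_model()` are all hypothetical names, a counter stands in for the reschedule check, and no real locking is done. It only illustrates why disposing of the isolated batch before restarting keeps the scan linear: everything isolated so far has already left the list by the time the walk starts over.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an inode linked on sb->s_inodes. */
struct item {
	struct item *prev, *next;	/* position on the per-sb list */
	struct item *dispose_next;	/* link on the private dispose list */
};

/* Hypothetical stand-in for the superblock: list head plus counters. */
struct sb_model {
	struct item head;	/* sentinel of a circular doubly linked list */
	size_t scanned;		/* loop iterations spent walking the list */
	size_t evicted;		/* items freed so far */
};

/* Assumption: a counter fakes the resched check, firing every `period` calls. */
static bool need_break(size_t *ticks, size_t period)
{
	return ++*ticks % period == 0;
}

/* Models dispose_list(): evicting an item also unlinks it from the sb list. */
static void dispose_list_model(struct sb_model *sb, struct item **dispose)
{
	while (*dispose) {
		struct item *it = *dispose;

		*dispose = it->dispose_next;
		it->prev->next = it->next;
		it->next->prev = it->prev;
		free(it);
		sb->evicted++;
	}
}

static void evict_all_model(struct sb_model *sb, size_t period)
{
	struct item *dispose = NULL;
	size_t ticks = 0;

again:
	/* The real code would hold sb->s_inode_list_lock from here. */
	for (struct item *it = sb->head.next; it != &sb->head; it = it->next) {
		sb->scanned++;
		it->dispose_next = dispose;	/* isolate onto the private list */
		dispose = it;

		if (need_break(&ticks, period)) {
			/* The lock would be dropped here, before disposing. */
			dispose_list_model(sb, &dispose);
			goto again;	/* isolated items are gone: no rescans */
		}
	}
	/* The lock would be dropped here as well. */
	dispose_list_model(sb, &dispose);
}

/* Builds a list of n items, evicts them all, and reports the work done. */
static size_t run_model(size_t n, size_t period, size_t *scanned)
{
	struct sb_model sb = { .scanned = 0, .evicted = 0 };

	assert(period > 0);
	sb.head.prev = sb.head.next = &sb.head;
	for (size_t i = 0; i < n; i++) {
		struct item *it = malloc(sizeof(*it));

		assert(it != NULL);
		it->prev = sb.head.prev;
		it->next = &sb.head;
		it->dispose_next = NULL;
		sb.head.prev->next = it;
		sb.head.prev = it;
	}
	evict_all_model(&sb, period);
	assert(sb.head.next == &sb.head);	/* list fully drained */
	*scanned = sb.scanned;
	return sb.evicted;
}
```

In this model, calling `run_model(10, 3, &scanned)` restarts the walk three times yet visits each item exactly once, because every restart begins on a list that no longer contains the items already isolated.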
On 04/01/2015 04:05 AM, Jan Kara wrote:
> Sorry for a late reply. I was ill last week...
>

That's ok, I was on vacation for the last two weeks ;).

> [...]
> Not so fine. How is this ever guaranteed to finish? We don't move inodes
> from the i_sb_list in this loop so if we ever take 'goto again' we just
> start doing all the work from the beginning...
>
> What needs to happen is that if we need to resched, we drop
> sb->s_inode_list_lock, call dispose_list(&dispose) and *then* restart from
> the beginning since we have freed all the inodes that we isolated...
>

Ooops, good point. I'll get this fixed up, thanks,

Josef
diff --git a/fs/inode.c b/fs/inode.c
index b961e5a..c58dbd3 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -574,6 +574,7 @@ static void dispose_list(struct list_head *head)
 		list_del_init(&inode->i_lru);
 
 		evict(inode);
+		cond_resched();
 	}
 }
 
@@ -592,6 +593,7 @@ void evict_inodes(struct super_block *sb)
 	LIST_HEAD(dispose);
 
 	spin_lock(&sb->s_inode_list_lock);
+again:
 	list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list) {
 		if (atomic_read(&inode->i_count))
 			continue;
@@ -606,6 +608,14 @@ void evict_inodes(struct super_block *sb)
 		inode_lru_list_del(inode);
 		spin_unlock(&inode->i_lock);
 		list_add(&inode->i_lru, &dispose);
+
+		/*
+		 * We can have a ton of inodes to evict at unmount time given
+		 * enough memory, check to see if we need to go to sleep for a
+		 * bit so we don't livelock.
+		 */
+		if (cond_resched_lock(&sb->s_inode_list_lock))
+			goto again;
 	}
 	spin_unlock(&sb->s_inode_list_lock);
On a box with a lot of ram (148gb) I can make the box softlockup after running
an fs_mark job that creates hundreds of millions of empty files. This is
because we never generate enough memory pressure to keep the number of inodes on
our unused list low, so when we go to unmount we have to evict ~100 million
inodes. This makes one processor a very unhappy person, so add a cond_resched()
in dispose_list() and cond_resched_lock() in the eviction isolation function to
combat this. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
---
 fs/inode.c | 10 ++++++++++
 1 file changed, 10 insertions(+)