Message ID | e71d3fbf-8b12-a650-e622-76bc9d4d6a98@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | [V2] fs: avoid softlockups in s_inodes iterators |
On Wed 16-10-19 12:11:24, Eric Sandeen wrote:
> When a filesystem is unmounted, we currently call fsnotify_sb_delete()
> before evict_inodes(), which means that fsnotify_unmount_inodes()
> must iterate over all inodes on the superblock, even though it will
> only act on inodes with a refcount. This is inefficient and can lead
> to livelocks as it iterates over many unrefcounted inodes.
>
> However, since fsnotify_sb_delete() and evict_inodes() are working
> on orthogonal sets of inodes (fsnotify_sb_delete() only cares about nonzero
> refcount, and evict_inodes() only cares about zero refcount), we can swap
> the order of the calls. The fsnotify call will then have a much smaller
> list to walk (any refcounted inodes).
>
> This should speed things up overall, and avoid livelocks in
> fsnotify_unmount_inodes().
>
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>

Thanks for the patch. It looks good to me. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

> ---
>
> I just did basic sanity testing here, but AFAIK there is no *notify
> test suite, so I'm not sure how to really give this a workout.

LTP has quite a few tests for inotify & fanotify.

								Honza

> diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
> index ac9eb273e28c..426f03b6e660 100644
> --- a/fs/notify/fsnotify.c
> +++ b/fs/notify/fsnotify.c
> @@ -57,6 +57,10 @@ static void fsnotify_unmount_inodes(struct super_block *sb)
>  		 * doing an __iget/iput with SB_ACTIVE clear would actually
>  		 * evict all inodes with zero i_count from icache which is
>  		 * unnecessarily violent and may in fact be illegal to do.
> +		 *
> +		 * However, we should have been called /after/ evict_inodes
> +		 * removed all zero refcount inodes, in any case. Test to
> +		 * be sure.
>  		 */
>  		if (!atomic_read(&inode->i_count)) {
>  			spin_unlock(&inode->i_lock);
> diff --git a/fs/super.c b/fs/super.c
> index cfadab2cbf35..cd352530eca9 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -448,10 +448,12 @@ void generic_shutdown_super(struct super_block *sb)
>  		sync_filesystem(sb);
>  		sb->s_flags &= ~SB_ACTIVE;
>
> -		fsnotify_sb_delete(sb);
>  		cgroup_writeback_umount();
>
> +		/* evict all inodes with zero refcount */
>  		evict_inodes(sb);
> +		/* only nonzero refcount inodes can have marks */
> +		fsnotify_sb_delete(sb);
>
>  		if (sb->s_dio_done_wq) {
>  			destroy_workqueue(sb->s_dio_done_wq);
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index ac9eb273e28c..426f03b6e660 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -57,6 +57,10 @@ static void fsnotify_unmount_inodes(struct super_block *sb)
 		 * doing an __iget/iput with SB_ACTIVE clear would actually
 		 * evict all inodes with zero i_count from icache which is
 		 * unnecessarily violent and may in fact be illegal to do.
+		 *
+		 * However, we should have been called /after/ evict_inodes
+		 * removed all zero refcount inodes, in any case. Test to
+		 * be sure.
 		 */
 		if (!atomic_read(&inode->i_count)) {
 			spin_unlock(&inode->i_lock);
diff --git a/fs/super.c b/fs/super.c
index cfadab2cbf35..cd352530eca9 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -448,10 +448,12 @@ void generic_shutdown_super(struct super_block *sb)
 		sync_filesystem(sb);
 		sb->s_flags &= ~SB_ACTIVE;
 
-		fsnotify_sb_delete(sb);
 		cgroup_writeback_umount();
 
+		/* evict all inodes with zero refcount */
 		evict_inodes(sb);
+		/* only nonzero refcount inodes can have marks */
+		fsnotify_sb_delete(sb);
 
 		if (sb->s_dio_done_wq) {
 			destroy_workqueue(sb->s_dio_done_wq);
When a filesystem is unmounted, we currently call fsnotify_sb_delete()
before evict_inodes(), which means that fsnotify_unmount_inodes()
must iterate over all inodes on the superblock, even though it will
only act on inodes with a refcount. This is inefficient and can lead
to livelocks as it iterates over many unrefcounted inodes.

However, since fsnotify_sb_delete() and evict_inodes() are working
on orthogonal sets of inodes (fsnotify_sb_delete() only cares about nonzero
refcount, and evict_inodes() only cares about zero refcount), we can swap
the order of the calls. The fsnotify call will then have a much smaller
list to walk (any refcounted inodes).

This should speed things up overall, and avoid livelocks in
fsnotify_unmount_inodes().

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---

I just did basic sanity testing here, but AFAIK there is no *notify
test suite, so I'm not sure how to really give this a workout.
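
For illustration only, the effect of the reordering can be sketched in user space. This is a toy model and not kernel code: struct toy_inode, toy_build(), toy_evict(), toy_notify_walk() and toy_free() are hypothetical names invented here, and the singly linked list merely stands in for the superblock's inode list. It assumes, as the changelog states, that the notify walk visits every entry but acts only on refcounted ones, and that eviction removes only zero-refcount entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for an inode on a superblock list (not the kernel's struct inode). */
struct toy_inode {
	int i_count;			/* reference count */
	struct toy_inode *next;
};

/* Build a list of `total` entries; every `every`-th entry holds one reference. */
struct toy_inode *toy_build(size_t total, size_t every)
{
	struct toy_inode *head = NULL;

	assert(every > 0);
	for (size_t i = 0; i < total; i++) {
		struct toy_inode *n = malloc(sizeof(*n));

		if (!n)
			abort();
		n->i_count = (i % every == 0) ? 1 : 0;
		n->next = head;
		head = n;
	}
	return head;
}

/* Model of eviction: unlink and free every zero-refcount entry; return how many went. */
size_t toy_evict(struct toy_inode **head)
{
	size_t evicted = 0;
	struct toy_inode **pp = head;

	while (*pp) {
		struct toy_inode *cur = *pp;

		if (cur->i_count == 0) {
			*pp = cur->next;
			free(cur);
			evicted++;
		} else {
			pp = &cur->next;
		}
	}
	return evicted;
}

/* Model of the notify walk: visit every entry, act only on refcounted ones. */
size_t toy_notify_walk(const struct toy_inode *head, size_t *acted)
{
	size_t visited = 0;

	*acted = 0;
	for (; head; head = head->next) {
		visited++;
		if (head->i_count > 0)
			(*acted)++;
	}
	return visited;
}

/* Release whatever is left on the list. */
void toy_free(struct toy_inode *head)
{
	while (head) {
		struct toy_inode *next = head->next;

		free(head);
		head = next;
	}
}
```

With a list from toy_build(1000, 100), calling toy_notify_walk() first (the old order) visits 1000 entries to act on 10. Calling toy_evict() first (the new order) removes 990 entries, and the walk then visits only the 10 refcounted ones. The model ignores locking and concurrency entirely; it only shows why the walk gets shorter.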