Message ID | 20211122064126.76734-1-ligang.bdlg@bytedance.com (mailing list archive) |
---|---|
State | New |
Series | [v1] shmem: change shrinklist_lock form spinlock to mutex and move iput into it |
On Mon, Nov 22, 2021 at 2:41 PM Gang Li <ligang.bdlg@bytedance.com> wrote:
>
> This patch fixes commit 779750d20b93 ("shmem: split huge pages
> beyond i_size under memory pressure").
>
> Calling iput() outside sbinfo->shrinklist_lock lets shmem_evict_inode()
> grab and delete the inode, which breaks the consistency between
> shrinklist_len and shrinklist. Simultaneous deletion of adjacent
> elements in the local list "list" by shmem_unused_huge_shrink() and
> shmem_evict_inode() also corrupts the list.
>
> iput() must be called under the lock or after it, but shrinklist_lock
> is a spinlock, which must not sleep, and iput() may sleep. [1]
>
> Fix it by changing shrinklist_lock from a spinlock to a mutex and moving
> iput() under this lock.
>
> [1] Link: http://lkml.kernel.org/r/20170131093141.GA15899@node.shutemov.name
>
> Fixes: 779750d20b93 ("shmem: split huge pages beyond i_size under memory pressure")
> Signed-off-by: Gang Li <ligang.bdlg@bytedance.com>
> ---
>  include/linux/shmem_fs.h |  2 +-
>  mm/shmem.c               | 16 +++++++---------
>  2 files changed, 8 insertions(+), 10 deletions(-)
>
> diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
> index 166158b6e917..65804fd264d0 100644
> --- a/include/linux/shmem_fs.h
> +++ b/include/linux/shmem_fs.h
> @@ -41,7 +41,7 @@ struct shmem_sb_info {
>  	ino_t next_ino;		    /* The next per-sb inode number to use */
>  	ino_t __percpu *ino_batch;  /* The next per-cpu inode number to use */
>  	struct mempolicy *mpol;     /* default memory policy for mappings */
> -	spinlock_t shrinklist_lock;   /* Protects shrinklist */
> +	struct mutex shrinklist_mutex;/* Protects shrinklist */
>  	struct list_head shrinklist;  /* List of shinkable inodes */
>  	unsigned long shrinklist_len; /* Length of shrinklist */
>  };
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 18f93c2d68f1..2165a28631c5 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -559,7 +559,7 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
>  	if (list_empty(&sbinfo->shrinklist))
>  		return SHRINK_STOP;
>
> -	spin_lock(&sbinfo->shrinklist_lock);
> +	mutex_lock(&sbinfo->shrinklist_mutex);
>  	list_for_each_safe(pos, next, &sbinfo->shrinklist) {
>  		info = list_entry(pos, struct shmem_inode_info, shrinklist);
>
> @@ -586,7 +586,6 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
>  		if (!--batch)
>  			break;
>  	}
> -	spin_unlock(&sbinfo->shrinklist_lock);
>
>  	list_for_each_safe(pos, next, &to_remove) {
>  		info = list_entry(pos, struct shmem_inode_info, shrinklist);
> @@ -643,10 +642,9 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
>  		iput(inode);

It could lead to deadlock, since we could be the last user
of @inode, then shmem_evict_inode() will be called and
try to acquire the mutex lock. Notice that the mutex is
already held here.

Thanks.

>  	}
>
> -	spin_lock(&sbinfo->shrinklist_lock);
>  	list_splice_tail(&list, &sbinfo->shrinklist);
>  	sbinfo->shrinklist_len -= removed;
> -	spin_unlock(&sbinfo->shrinklist_lock);
> +	mutex_unlock(&sbinfo->shrinklist_mutex);
>
>  	return split;
>  }
> @@ -1137,12 +1135,12 @@ static void shmem_evict_inode(struct inode *inode)
>  		inode->i_size = 0;
>  		shmem_truncate_range(inode, 0, (loff_t)-1);
>  		if (!list_empty(&info->shrinklist)) {
> -			spin_lock(&sbinfo->shrinklist_lock);
> +			mutex_lock(&sbinfo->shrinklist_mutex);
>  			if (!list_empty(&info->shrinklist)) {
>  				list_del_init(&info->shrinklist);
>  				sbinfo->shrinklist_len--;
>  			}
> -			spin_unlock(&sbinfo->shrinklist_lock);
> +			mutex_unlock(&sbinfo->shrinklist_mutex);
>  		}
>  		while (!list_empty(&info->swaplist)) {
>  			/* Wait while shmem_unuse() is scanning this inode... */
> @@ -1954,7 +1952,7 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
>  			 * Part of the huge page is beyond i_size: subject
>  			 * to shrink under memory pressure.
>  			 */
> -			spin_lock(&sbinfo->shrinklist_lock);
> +			mutex_lock(&sbinfo->shrinklist_mutex);
>  			/*
>  			 * _careful to defend against unlocked access to
>  			 * ->shrink_list in shmem_unused_huge_shrink()
> @@ -1964,7 +1962,7 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
>  					      &sbinfo->shrinklist);
>  				sbinfo->shrinklist_len++;
>  			}
> -			spin_unlock(&sbinfo->shrinklist_lock);
> +			mutex_unlock(&sbinfo->shrinklist_mutex);
>  		}
>
>  		/*
> @@ -3652,7 +3650,7 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc)
>  	raw_spin_lock_init(&sbinfo->stat_lock);
>  	if (percpu_counter_init(&sbinfo->used_blocks, 0, GFP_KERNEL))
>  		goto failed;
> -	spin_lock_init(&sbinfo->shrinklist_lock);
> +	mutex_init(&sbinfo->shrinklist_mutex);
>  	INIT_LIST_HEAD(&sbinfo->shrinklist);
>
>  	sb->s_maxbytes = MAX_LFS_FILESIZE;
> --
> 2.20.1
>
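
The self-deadlock described in the reply above can be reproduced in user space. The following is only a minimal sketch under stated assumptions, not kernel code: `fake_sbinfo`, `fake_inode`, `fake_iput()`, `fake_evict_inode()` and `demo_self_deadlock()` are hypothetical stand-ins that do not exist in the kernel, and a POSIX error-checking mutex replaces the kernel mutex so that the nested lock attempt by the same thread reports an error instead of hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical user-space stand-ins; none of these names exist in the kernel. */
struct fake_sbinfo {
	pthread_mutex_t shrinklist_mutex;
	unsigned long shrinklist_len;
};

struct fake_inode {
	struct fake_sbinfo *sbinfo;
	int refcount;
	int on_shrinklist;
};

/* Result of the lock attempt made on the eviction path. */
static int evict_lock_result;

/* Mirrors the shape of shmem_evict_inode(): take the mutex, unlink, unlock. */
static void fake_evict_inode(struct fake_inode *inode)
{
	struct fake_sbinfo *sbinfo = inode->sbinfo;

	evict_lock_result = pthread_mutex_lock(&sbinfo->shrinklist_mutex);
	if (evict_lock_result != 0)
		return;	/* a plain, non-checking mutex would block forever here */

	if (inode->on_shrinklist) {
		inode->on_shrinklist = 0;
		sbinfo->shrinklist_len--;
	}
	pthread_mutex_unlock(&sbinfo->shrinklist_mutex);
}

/* Dropping the last reference triggers eviction, as iput() can. */
static void fake_iput(struct fake_inode *inode)
{
	if (--inode->refcount == 0)
		fake_evict_inode(inode);
}

/* Returns the error code seen by the nested lock attempt. */
int demo_self_deadlock(void)
{
	struct fake_sbinfo sb = { .shrinklist_len = 1 };
	struct fake_inode inode = { .sbinfo = &sb, .refcount = 1, .on_shrinklist = 1 };
	pthread_mutexattr_t attr;
	int rc;

	rc = pthread_mutexattr_init(&attr);
	assert(rc == 0);
	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(rc == 0);
	rc = pthread_mutex_init(&sb.shrinklist_mutex, &attr);
	assert(rc == 0);
	pthread_mutexattr_destroy(&attr);

	/* The shrinker path as patched: iput() is called with the mutex held. */
	pthread_mutex_lock(&sb.shrinklist_mutex);
	fake_iput(&inode);	/* we are the last user, so eviction runs here */
	pthread_mutex_unlock(&sb.shrinklist_mutex);

	pthread_mutex_destroy(&sb.shrinklist_mutex);
	return evict_lock_result;
}
```

On a POSIX-conforming libc, `demo_self_deadlock()` is expected to return `EDEADLK`, because the thread that already holds `shrinklist_mutex` tries to take it again from the eviction path. The error-checking mutex type is chosen purely so the sketch terminates; it is an assumption of this illustration and says nothing about how the kernel mutex reports the condition.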
Hi Gang,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on hnaz-mm/master]
[also build test WARNING on linux/master linus/master v5.16-rc2 next-20211125]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Gang-Li/shmem-change-shrinklist_lock-form-spinlock-to-mutex-and-move-iput-into-it/20211122-144228
base:   https://github.com/hnaz/linux-mm master
config: i386-randconfig-m021-20211124 (https://download.01.org/0day-ci/archive/20211126/202111260701.YxF96BC5-lkp@intel.com/config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

smatch warnings:
mm/shmem.c:1139 shmem_evict_inode() warn: inconsistent indenting

vim +1139 mm/shmem.c

^1da177e4c3f41 Linus Torvalds     2005-04-16  1127
1f895f75dc0881 Al Viro            2010-06-05  1128  static void shmem_evict_inode(struct inode *inode)
^1da177e4c3f41 Linus Torvalds     2005-04-16  1129  {
^1da177e4c3f41 Linus Torvalds     2005-04-16  1130  	struct shmem_inode_info *info = SHMEM_I(inode);
779750d20b93bb Kirill A. Shutemov 2016-07-26  1131  	struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
^1da177e4c3f41 Linus Torvalds     2005-04-16  1132
30e6a51dbb0594 Hui Su             2020-12-14  1133  	if (shmem_mapping(inode->i_mapping)) {
^1da177e4c3f41 Linus Torvalds     2005-04-16  1134  		shmem_unacct_size(info->flags, inode->i_size);
^1da177e4c3f41 Linus Torvalds     2005-04-16  1135  		inode->i_size = 0;
3889e6e76f66b7 Nicholas Piggin    2010-05-27  1136  		shmem_truncate_range(inode, 0, (loff_t)-1);
779750d20b93bb Kirill A. Shutemov 2016-07-26  1137  		if (!list_empty(&info->shrinklist)) {
713e6a98816b68 Gang Li            2021-11-22  1138  			mutex_lock(&sbinfo->shrinklist_mutex);
779750d20b93bb Kirill A. Shutemov 2016-07-26 @1139  			if (!list_empty(&info->shrinklist)) {
779750d20b93bb Kirill A. Shutemov 2016-07-26  1140  				list_del_init(&info->shrinklist);
779750d20b93bb Kirill A. Shutemov 2016-07-26  1141  				sbinfo->shrinklist_len--;
779750d20b93bb Kirill A. Shutemov 2016-07-26  1142  			}
713e6a98816b68 Gang Li            2021-11-22  1143  			mutex_unlock(&sbinfo->shrinklist_mutex);
779750d20b93bb Kirill A. Shutemov 2016-07-26  1144  		}
af53d3e9e04024 Hugh Dickins       2019-04-18  1145  		while (!list_empty(&info->swaplist)) {
af53d3e9e04024 Hugh Dickins       2019-04-18  1146  			/* Wait while shmem_unuse() is scanning this inode... */
af53d3e9e04024 Hugh Dickins       2019-04-18  1147  			wait_var_event(&info->stop_eviction,
af53d3e9e04024 Hugh Dickins       2019-04-18  1148  				       !atomic_read(&info->stop_eviction));
cb5f7b9a47963d Hugh Dickins       2008-02-04  1149  			mutex_lock(&shmem_swaplist_mutex);
af53d3e9e04024 Hugh Dickins       2019-04-18  1150  			/* ...but beware of the race if we peeked too early */
af53d3e9e04024 Hugh Dickins       2019-04-18  1151  			if (!atomic_read(&info->stop_eviction))
^1da177e4c3f41 Linus Torvalds     2005-04-16  1152  				list_del_init(&info->swaplist);
cb5f7b9a47963d Hugh Dickins       2008-02-04  1153  			mutex_unlock(&shmem_swaplist_mutex);
^1da177e4c3f41 Linus Torvalds     2005-04-16  1154  		}
3ed47db34f480d Al Viro            2016-01-22  1155  	}
b09e0fa4b4ea66 Eric Paris         2011-05-24  1156
38f38657444d15 Aristeu Rozanski   2012-08-23  1157  	simple_xattrs_free(&info->xattrs);
0f3c42f522dc1a Hugh Dickins       2012-11-16  1158  	WARN_ON(inode->i_blocks);
5b04c6890f0dc7 Pavel Emelyanov    2008-02-04  1159  	shmem_free_inode(inode->i_sb);
dbd5768f87ff6f Jan Kara           2012-05-03  1160  	clear_inode(inode);
^1da177e4c3f41 Linus Torvalds     2005-04-16  1161  }
^1da177e4c3f41 Linus Torvalds     2005-04-16  1162

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 166158b6e917..65804fd264d0 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -41,7 +41,7 @@ struct shmem_sb_info {
 	ino_t next_ino;		    /* The next per-sb inode number to use */
 	ino_t __percpu *ino_batch;  /* The next per-cpu inode number to use */
 	struct mempolicy *mpol;     /* default memory policy for mappings */
-	spinlock_t shrinklist_lock;   /* Protects shrinklist */
+	struct mutex shrinklist_mutex;/* Protects shrinklist */
 	struct list_head shrinklist;  /* List of shinkable inodes */
 	unsigned long shrinklist_len; /* Length of shrinklist */
 };
diff --git a/mm/shmem.c b/mm/shmem.c
index 18f93c2d68f1..2165a28631c5 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -559,7 +559,7 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
 	if (list_empty(&sbinfo->shrinklist))
 		return SHRINK_STOP;

-	spin_lock(&sbinfo->shrinklist_lock);
+	mutex_lock(&sbinfo->shrinklist_mutex);
 	list_for_each_safe(pos, next, &sbinfo->shrinklist) {
 		info = list_entry(pos, struct shmem_inode_info, shrinklist);

@@ -586,7 +586,6 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
 		if (!--batch)
 			break;
 	}
-	spin_unlock(&sbinfo->shrinklist_lock);

 	list_for_each_safe(pos, next, &to_remove) {
 		info = list_entry(pos, struct shmem_inode_info, shrinklist);
@@ -643,10 +642,9 @@ static unsigned long shmem_unused_huge_shrink(struct shmem_sb_info *sbinfo,
 		iput(inode);
 	}

-	spin_lock(&sbinfo->shrinklist_lock);
 	list_splice_tail(&list, &sbinfo->shrinklist);
 	sbinfo->shrinklist_len -= removed;
-	spin_unlock(&sbinfo->shrinklist_lock);
+	mutex_unlock(&sbinfo->shrinklist_mutex);

 	return split;
 }
@@ -1137,12 +1135,12 @@ static void shmem_evict_inode(struct inode *inode)
 		inode->i_size = 0;
 		shmem_truncate_range(inode, 0, (loff_t)-1);
 		if (!list_empty(&info->shrinklist)) {
-			spin_lock(&sbinfo->shrinklist_lock);
+			mutex_lock(&sbinfo->shrinklist_mutex);
 			if (!list_empty(&info->shrinklist)) {
 				list_del_init(&info->shrinklist);
 				sbinfo->shrinklist_len--;
 			}
-			spin_unlock(&sbinfo->shrinklist_lock);
+			mutex_unlock(&sbinfo->shrinklist_mutex);
 		}
 		while (!list_empty(&info->swaplist)) {
 			/* Wait while shmem_unuse() is scanning this inode... */
@@ -1954,7 +1952,7 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
 			 * Part of the huge page is beyond i_size: subject
 			 * to shrink under memory pressure.
 			 */
-			spin_lock(&sbinfo->shrinklist_lock);
+			mutex_lock(&sbinfo->shrinklist_mutex);
 			/*
 			 * _careful to defend against unlocked access to
 			 * ->shrink_list in shmem_unused_huge_shrink()
@@ -1964,7 +1962,7 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
 					      &sbinfo->shrinklist);
 				sbinfo->shrinklist_len++;
 			}
-			spin_unlock(&sbinfo->shrinklist_lock);
+			mutex_unlock(&sbinfo->shrinklist_mutex);
 		}

 		/*
@@ -3652,7 +3650,7 @@ static int shmem_fill_super(struct super_block *sb, struct fs_context *fc)
 	raw_spin_lock_init(&sbinfo->stat_lock);
 	if (percpu_counter_init(&sbinfo->used_blocks, 0, GFP_KERNEL))
 		goto failed;
-	spin_lock_init(&sbinfo->shrinklist_lock);
+	mutex_init(&sbinfo->shrinklist_mutex);
 	INIT_LIST_HEAD(&sbinfo->shrinklist);

 	sb->s_maxbytes = MAX_LFS_FILESIZE;
This patch fixes commit 779750d20b93 ("shmem: split huge pages
beyond i_size under memory pressure").

Calling iput() outside sbinfo->shrinklist_lock lets shmem_evict_inode()
grab and delete the inode, which breaks the consistency between
shrinklist_len and shrinklist. Simultaneous deletion of adjacent
elements in the local list "list" by shmem_unused_huge_shrink() and
shmem_evict_inode() also corrupts the list.

iput() must be called under the lock or after it, but shrinklist_lock
is a spinlock, which must not sleep, and iput() may sleep. [1]

Fix it by changing shrinklist_lock from a spinlock to a mutex and moving
iput() under this lock.

[1] Link: http://lkml.kernel.org/r/20170131093141.GA15899@node.shutemov.name

Fixes: 779750d20b93 ("shmem: split huge pages beyond i_size under memory pressure")
Signed-off-by: Gang Li <ligang.bdlg@bytedance.com>
---
 include/linux/shmem_fs.h |  2 +-
 mm/shmem.c               | 16 +++++++---------
 2 files changed, 8 insertions(+), 10 deletions(-)