Message ID | 20190619062424.3486524-7-songliubraving@fb.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Enable THP for text section of non-shmem files | expand |
On Tue, 2019-06-18 at 23:24 -0700, Song Liu wrote: > index 8563339041f6..bab8d9eef46c 100644 > --- a/mm/truncate.c > +++ b/mm/truncate.c > @@ -790,7 +790,11 @@ EXPORT_SYMBOL_GPL(invalidate_inode_pages2); > void truncate_pagecache(struct inode *inode, loff_t newsize) > { > struct address_space *mapping = inode->i_mapping; > - loff_t holebegin = round_up(newsize, PAGE_SIZE); > + loff_t holebegin; > + > + /* if non-shmem file has thp, truncate the whole file */ > + if (filemap_nr_thps(mapping)) > + newsize = 0; > I don't get it. Sometimes truncate is used to increase the size of a file, or to change it to a non-zero size. Won't forcing the newsize to zero break applications, when the file is truncated to a different size than they expect?
> On Jun 19, 2019, at 6:39 PM, Rik van Riel <riel@fb.com> wrote: > > On Tue, 2019-06-18 at 23:24 -0700, Song Liu wrote: > >> index 8563339041f6..bab8d9eef46c 100644 >> --- a/mm/truncate.c >> +++ b/mm/truncate.c >> @@ -790,7 +790,11 @@ EXPORT_SYMBOL_GPL(invalidate_inode_pages2); >> void truncate_pagecache(struct inode *inode, loff_t newsize) >> { >> struct address_space *mapping = inode->i_mapping; >> - loff_t holebegin = round_up(newsize, PAGE_SIZE); >> + loff_t holebegin; >> + >> + /* if non-shmem file has thp, truncate the whole file */ >> + if (filemap_nr_thps(mapping)) >> + newsize = 0; >> > > I don't get it. Sometimes truncate is used to > increase the size of a file, or to change it > to a non-zero size. > > Won't forcing the newsize to zero break applications, > when the file is truncated to a different size than > they expect? This is not truncate the file. It only drops page cache. truncate_setsize() will still set correct size. I don't think this breaks anything. We can probably make it smarter and only drop the clean huge pages (dirty page should not exist). Thanks, Song
On Wed, 2019-06-19 at 22:10 -0400, Song Liu wrote: > > On Jun 19, 2019, at 6:39 PM, Rik van Riel <riel@fb.com> wrote: > > > > On Tue, 2019-06-18 at 23:24 -0700, Song Liu wrote: > > > > > index 8563339041f6..bab8d9eef46c 100644 > > > --- a/mm/truncate.c > > > +++ b/mm/truncate.c > > > @@ -790,7 +790,11 @@ EXPORT_SYMBOL_GPL(invalidate_inode_pages2); > > > void truncate_pagecache(struct inode *inode, loff_t newsize) > > > { > > > struct address_space *mapping = inode->i_mapping; > > > - loff_t holebegin = round_up(newsize, PAGE_SIZE); > > > + loff_t holebegin; > > > + > > > + /* if non-shmem file has thp, truncate the whole file */ > > > + if (filemap_nr_thps(mapping)) > > > + newsize = 0; > > > > > > > I don't get it. Sometimes truncate is used to > > increase the size of a file, or to change it > > to a non-zero size. > > > > Won't forcing the newsize to zero break applications, > > when the file is truncated to a different size than > > they expect? > > This is not truncate the file. It only drops page cache. > truncate_setsize() will still set correct size. I don't > think this breaks anything. Ahhh, indeed. Good point. I wonder if the dropping of the page cache could be done automatically from open(), if it determines that there are no more readonly THP users of the file, and the new opener wants to write to the file? That magic could be in a helper function, so it would be just a one line change in the same spot where it currently denies the permission :) > We can probably make it smarter and only drop the clean > huge pages (dirty page should not exist).
> On Jun 19, 2019, at 8:10 PM, Song Liu <songliubraving@fb.com> wrote: > > This is not truncate the file. It only drops page cache. > truncate_setsize() will still set correct size. I don't > think this breaks anything. > > We can probably make it smarter and only drop the clean > huge pages (dirty page should not exist). It sounds like I will need this change for my THP work as well for the same reason; once a RO THP text page is in the page cache, if the file is marked writable strange things will occur.
diff --git a/fs/inode.c b/fs/inode.c index df6542ec3b88..518113a4e219 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -181,6 +181,9 @@ int inode_init_always(struct super_block *sb, struct inode *inode) mapping->flags = 0; mapping->wb_err = 0; atomic_set(&mapping->i_mmap_writable, 0); +#ifdef CONFIG_READ_ONLY_THP_FOR_FS + atomic_set(&mapping->nr_thps, 0); +#endif mapping_set_gfp_mask(mapping, GFP_HIGHUSER_MOVABLE); mapping->private_data = NULL; mapping->writeback_index = 0; diff --git a/include/linux/fs.h b/include/linux/fs.h index f7fdfe93e25d..3edf4ee42eee 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -444,6 +444,10 @@ struct address_space { struct xarray i_pages; gfp_t gfp_mask; atomic_t i_mmap_writable; +#ifdef CONFIG_READ_ONLY_THP_FOR_FS + /* number of thp, only for non-shmem files */ + atomic_t nr_thps; +#endif struct rb_root_cached i_mmap; struct rw_semaphore i_mmap_rwsem; unsigned long nrpages; @@ -2790,6 +2794,33 @@ static inline errseq_t filemap_sample_wb_err(struct address_space *mapping) return errseq_sample(&mapping->wb_err); } +static inline int filemap_nr_thps(struct address_space *mapping) +{ +#ifdef CONFIG_READ_ONLY_THP_FOR_FS + return atomic_read(&mapping->nr_thps); +#else + return 0; +#endif +} + +static inline void filemap_nr_thps_inc(struct address_space *mapping) +{ +#ifdef CONFIG_READ_ONLY_THP_FOR_FS + atomic_inc(&mapping->nr_thps); +#else + WARN_ON_ONCE(1); +#endif +} + +static inline void filemap_nr_thps_dec(struct address_space *mapping) +{ +#ifdef CONFIG_READ_ONLY_THP_FOR_FS + atomic_dec(&mapping->nr_thps); +#else + WARN_ON_ONCE(1); +#endif +} + extern int vfs_fsync_range(struct file *file, loff_t start, loff_t end, int datasync); extern int vfs_fsync(struct file *file, int datasync); diff --git a/mm/filemap.c b/mm/filemap.c index e79ceccdc6df..a8e86c136381 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -205,6 +205,7 @@ static void unaccount_page_cache_page(struct address_space *mapping, __dec_node_page_state(page, NR_SHMEM_THPS); } else if (PageTransHuge(page)) { __dec_node_page_state(page, NR_FILE_THPS); + filemap_nr_thps_dec(mapping); } /* diff --git a/mm/khugepaged.c b/mm/khugepaged.c index fbcff5a1d65a..17ebe9da56ce 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1500,8 +1500,10 @@ static void collapse_file(struct vm_area_struct *vma, if (is_shmem) __inc_node_page_state(new_page, NR_SHMEM_THPS); - else + else { __inc_node_page_state(new_page, NR_FILE_THPS); + filemap_nr_thps_inc(mapping); + } if (nr_none) { struct zone *zone = page_zone(new_page); diff --git a/mm/truncate.c b/mm/truncate.c index 8563339041f6..bab8d9eef46c 100644 --- a/mm/truncate.c +++ b/mm/truncate.c @@ -790,7 +790,11 @@ EXPORT_SYMBOL_GPL(invalidate_inode_pages2); void truncate_pagecache(struct inode *inode, loff_t newsize) { struct address_space *mapping = inode->i_mapping; - loff_t holebegin = round_up(newsize, PAGE_SIZE); + loff_t holebegin; + + /* if non-shmem file has thp, truncate the whole file */ + if (filemap_nr_thps(mapping)) + newsize = 0; /* * unmap_mapping_range is called twice, first simply for @@ -801,6 +805,7 @@ void truncate_pagecache(struct inode *inode, loff_t newsize) * truncate_inode_pages finishes, hence the second * unmap_mapping_range call must be made for correctness. */ + holebegin = round_up(newsize, PAGE_SIZE); unmap_mapping_range(mapping, holebegin, 0, 1); truncate_inode_pages(mapping, newsize); unmap_mapping_range(mapping, holebegin, 0, 1);
In previous patch, an application could put part of its text section in THP via madvise(). These THPs will be protected from writes when the application is still running (TXTBSY). However, after the application exits, the file is available for writes. This patch briefly handles such writes by truncate the whole file in sys_open(). A new counter nr_thps is added to struct address_space. in truncate_pagecache(), if nr_thps is not zero, we force truncate of the whole file. Signed-off-by: Song Liu <songliubraving@fb.com> --- fs/inode.c | 3 +++ include/linux/fs.h | 31 +++++++++++++++++++++++++++++++ mm/filemap.c | 1 + mm/khugepaged.c | 4 +++- mm/truncate.c | 7 ++++++- 5 files changed, 44 insertions(+), 2 deletions(-)