Message ID | 20240515055719.32577-12-da.gomez@samsung.com (mailing list archive) |
---|---|
State | Accepted, archived |
Headers | show |
Series | [01/12] splice: don't check for uptodate if partially uptodate is impl | expand |
Hi Daniel,
kernel test robot noticed the following build warnings:
[auto build test WARNING on akpm-mm/mm-everything]
[also build test WARNING on xfs-linux/for-next brauner-vfs/vfs.all linus/master v6.9 next-20240515]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Daniel-Gomez/splice-don-t-check-for-uptodate-if-partially-uptodate-is-impl/20240515-135925
base: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/r/20240515055719.32577-12-da.gomez%40samsung.com
patch subject: [PATCH 11/12] shmem: add file length arg in shmem_get_folio() path
config: openrisc-defconfig (https://download.01.org/0day-ci/archive/20240516/202405160144.a9ad9CX5-lkp@intel.com/config)
compiler: or1k-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240516/202405160144.a9ad9CX5-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202405160144.a9ad9CX5-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> mm/shmem.c:2382: warning: Function parameter or struct member 'len' not described in 'shmem_get_folio'
vim +2382 mm/shmem.c
^1da177e4c3f41 Linus Torvalds 2005-04-16 2356
d7468609ee0f90 Christoph Hellwig 2024-02-19 2357 /**
d7468609ee0f90 Christoph Hellwig 2024-02-19 2358 * shmem_get_folio - find, and lock a shmem folio.
d7468609ee0f90 Christoph Hellwig 2024-02-19 2359 * @inode: inode to search
d7468609ee0f90 Christoph Hellwig 2024-02-19 2360 * @index: the page index.
d7468609ee0f90 Christoph Hellwig 2024-02-19 2361 * @foliop: pointer to the folio if found
d7468609ee0f90 Christoph Hellwig 2024-02-19 2362 * @sgp: SGP_* flags to control behavior
d7468609ee0f90 Christoph Hellwig 2024-02-19 2363 *
d7468609ee0f90 Christoph Hellwig 2024-02-19 2364 * Looks up the page cache entry at @inode & @index. If a folio is
d7468609ee0f90 Christoph Hellwig 2024-02-19 2365 * present, it is returned locked with an increased refcount.
d7468609ee0f90 Christoph Hellwig 2024-02-19 2366 *
9d8b36744935f8 Christoph Hellwig 2024-02-19 2367 * If the caller modifies data in the folio, it must call folio_mark_dirty()
9d8b36744935f8 Christoph Hellwig 2024-02-19 2368 * before unlocking the folio to ensure that the folio is not reclaimed.
9d8b36744935f8 Christoph Hellwig 2024-02-19 2369 * There is no need to reserve space before calling folio_mark_dirty().
9d8b36744935f8 Christoph Hellwig 2024-02-19 2370 *
d7468609ee0f90 Christoph Hellwig 2024-02-19 2371 * When no folio is found, the behavior depends on @sgp:
8d4dd9d741c330 Akira Yokosawa 2024-02-27 2372 * - for SGP_READ, *@foliop is %NULL and 0 is returned
8d4dd9d741c330 Akira Yokosawa 2024-02-27 2373 * - for SGP_NOALLOC, *@foliop is %NULL and -ENOENT is returned
d7468609ee0f90 Christoph Hellwig 2024-02-19 2374 * - for all other flags a new folio is allocated, inserted into the
d7468609ee0f90 Christoph Hellwig 2024-02-19 2375 * page cache and returned locked in @foliop.
d7468609ee0f90 Christoph Hellwig 2024-02-19 2376 *
d7468609ee0f90 Christoph Hellwig 2024-02-19 2377 * Context: May sleep.
d7468609ee0f90 Christoph Hellwig 2024-02-19 2378 * Return: 0 if successful, else a negative error code.
d7468609ee0f90 Christoph Hellwig 2024-02-19 2379 */
4e1fc793ad9892 Matthew Wilcox (Oracle 2022-09-02 2380) int shmem_get_folio(struct inode *inode, pgoff_t index, struct folio **foliop,
02efe2fbe45ffd Daniel Gomez 2024-05-15 2381 enum sgp_type sgp, size_t len)
4e1fc793ad9892 Matthew Wilcox (Oracle 2022-09-02 @2382) {
4e1fc793ad9892 Matthew Wilcox (Oracle 2022-09-02 2383) return shmem_get_folio_gfp(inode, index, foliop, sgp,
02efe2fbe45ffd Daniel Gomez 2024-05-15 2384 mapping_gfp_mask(inode->i_mapping), NULL, NULL, len);
4e1fc793ad9892 Matthew Wilcox (Oracle 2022-09-02 2385) }
d7468609ee0f90 Christoph Hellwig 2024-02-19 2386 EXPORT_SYMBOL_GPL(shmem_get_folio);
4e1fc793ad9892 Matthew Wilcox (Oracle 2022-09-02 2387)
On Wed, May 15, 2024 at 05:57:36AM +0000, Daniel Gomez wrote: > In preparation for large folio in the write and fallocate paths, add > file length argument in shmem_get_folio() path to be able to calculate > the folio order based on the file size. Use of order-0 (PAGE_SIZE) for > read, page cache read, and vm fault. > > This enables high order folios in the write and fallocate path once the > folio order is calculated based on the length. > > Signed-off-by: Daniel Gomez <da.gomez@samsung.com> > --- > fs/xfs/scrub/xfile.c | 6 +++--- > fs/xfs/xfs_buf_mem.c | 3 ++- > include/linux/shmem_fs.h | 2 +- > mm/khugepaged.c | 3 ++- > mm/shmem.c | 35 ++++++++++++++++++++--------------- > mm/userfaultfd.c | 2 +- > 6 files changed, 29 insertions(+), 22 deletions(-) > > diff --git a/fs/xfs/scrub/xfile.c b/fs/xfs/scrub/xfile.c > index 8cdd863db585..4905f5e4cb5d 100644 > --- a/fs/xfs/scrub/xfile.c > +++ b/fs/xfs/scrub/xfile.c > @@ -127,7 +127,7 @@ xfile_load( > unsigned int offset; > > if (shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, > - SGP_READ) < 0) > + SGP_READ, PAGE_SIZE) < 0) I suppose I /did/ say during LSFMM that for the current users of xfile.c and xfs_buf_mem.c the order of the folio being returned doesn't really matter, but why wouldn't the last argument here be "roundup_64(count, PAGE_SIZE)" ? Shouldn't we at least hint to the page cache about the folio order that we actually want instead of limiting it to order-0? (Also it seems a little odd to me that the @index is in units of pgoff_t but @len is in bytes.) > break; > if (!folio) { > /* > @@ -197,7 +197,7 @@ xfile_store( > unsigned int offset; > > if (shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, > - SGP_CACHE) < 0) > + SGP_CACHE, PAGE_SIZE) < 0) > break; > if (filemap_check_wb_err(inode->i_mapping, 0)) { > folio_unlock(folio); > @@ -268,7 +268,7 @@ xfile_get_folio( > > pflags = memalloc_nofs_save(); > error = shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, > - (flags & XFILE_ALLOC) ? SGP_CACHE : SGP_READ); > + (flags & XFILE_ALLOC) ? SGP_CACHE : SGP_READ, PAGE_SIZE); > memalloc_nofs_restore(pflags); > if (error) > return ERR_PTR(error); > diff --git a/fs/xfs/xfs_buf_mem.c b/fs/xfs/xfs_buf_mem.c > index 9bb2d24de709..784c81d35a1f 100644 > --- a/fs/xfs/xfs_buf_mem.c > +++ b/fs/xfs/xfs_buf_mem.c > @@ -149,7 +149,8 @@ xmbuf_map_page( > return -ENOMEM; > } > > - error = shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, SGP_CACHE); > + error = shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, SGP_CACHE, > + PAGE_SIZE); This is ok unless someone wants to use a different XMBUF_BLOCKSIZE. --D > if (error) > return error; > > diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h > index 3fb18f7eb73e..bc59b4a00228 100644 > --- a/include/linux/shmem_fs.h > +++ b/include/linux/shmem_fs.h > @@ -142,7 +142,7 @@ enum sgp_type { > }; > > int shmem_get_folio(struct inode *inode, pgoff_t index, struct folio **foliop, > - enum sgp_type sgp); > + enum sgp_type sgp, size_t len); > struct folio *shmem_read_folio_gfp(struct address_space *mapping, > pgoff_t index, gfp_t gfp); > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c > index 38830174608f..947770ded68c 100644 > --- a/mm/khugepaged.c > +++ b/mm/khugepaged.c > @@ -1863,7 +1863,8 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr, > xas_unlock_irq(&xas); > /* swap in or instantiate fallocated page */ > if (shmem_get_folio(mapping->host, index, > - &folio, SGP_NOALLOC)) { > + &folio, SGP_NOALLOC, > + PAGE_SIZE)) { > result = SCAN_FAIL; > goto xa_unlocked; > } > diff --git a/mm/shmem.c b/mm/shmem.c > index d531018ffece..fcd2c9befe19 100644 > --- a/mm/shmem.c > +++ b/mm/shmem.c > @@ -1134,7 +1134,7 @@ static struct folio *shmem_get_partial_folio(struct inode *inode, pgoff_t index) > * (although in some cases this is just a waste of time). > */ > folio = NULL; > - shmem_get_folio(inode, index, &folio, SGP_READ); > + shmem_get_folio(inode, index, &folio, SGP_READ, PAGE_SIZE); > return folio; > } > > @@ -1844,7 +1844,7 @@ static struct folio *shmem_alloc_folio(gfp_t gfp, struct shmem_inode_info *info, > > static struct folio *shmem_alloc_and_add_folio(gfp_t gfp, > struct inode *inode, pgoff_t index, > - struct mm_struct *fault_mm, bool huge) > + struct mm_struct *fault_mm, bool huge, size_t len) > { > struct address_space *mapping = inode->i_mapping; > struct shmem_inode_info *info = SHMEM_I(inode); > @@ -2173,7 +2173,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index, > */ > static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index, > struct folio **foliop, enum sgp_type sgp, gfp_t gfp, > - struct vm_fault *vmf, vm_fault_t *fault_type) > + struct vm_fault *vmf, vm_fault_t *fault_type, size_t len) > { > struct vm_area_struct *vma = vmf ? vmf->vma : NULL; > struct mm_struct *fault_mm; > @@ -2258,7 +2258,7 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index, > huge_gfp = vma_thp_gfp_mask(vma); > huge_gfp = limit_gfp_mask(huge_gfp, gfp); > folio = shmem_alloc_and_add_folio(huge_gfp, > - inode, index, fault_mm, true); > + inode, index, fault_mm, true, len); > if (!IS_ERR(folio)) { > count_vm_event(THP_FILE_ALLOC); > goto alloced; > @@ -2267,7 +2267,8 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index, > goto repeat; > } > > - folio = shmem_alloc_and_add_folio(gfp, inode, index, fault_mm, false); > + folio = shmem_alloc_and_add_folio(gfp, inode, index, fault_mm, false, > + len); > if (IS_ERR(folio)) { > error = PTR_ERR(folio); > if (error == -EEXIST) > @@ -2377,10 +2378,10 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index, > * Return: 0 if successful, else a negative error code. > */ > int shmem_get_folio(struct inode *inode, pgoff_t index, struct folio **foliop, > - enum sgp_type sgp) > + enum sgp_type sgp, size_t len) > { > return shmem_get_folio_gfp(inode, index, foliop, sgp, > - mapping_gfp_mask(inode->i_mapping), NULL, NULL); > + mapping_gfp_mask(inode->i_mapping), NULL, NULL, len); > } > EXPORT_SYMBOL_GPL(shmem_get_folio); > > @@ -2475,7 +2476,7 @@ static vm_fault_t shmem_fault(struct vm_fault *vmf) > > WARN_ON_ONCE(vmf->page != NULL); > err = shmem_get_folio_gfp(inode, vmf->pgoff, &folio, SGP_CACHE, > - gfp, vmf, &ret); > + gfp, vmf, &ret, PAGE_SIZE); > if (err) > return vmf_error(err); > if (folio) { > @@ -2954,6 +2955,9 @@ shmem_write_begin(struct file *file, struct address_space *mapping, > struct folio *folio; > int ret = 0; > > + if (!mapping_large_folio_support(mapping)) > + len = min_t(size_t, len, PAGE_SIZE - offset_in_page(pos)); > + > /* i_rwsem is held by caller */ > if (unlikely(info->seals & (F_SEAL_GROW | > F_SEAL_WRITE | F_SEAL_FUTURE_WRITE))) { > @@ -2963,7 +2967,7 @@ shmem_write_begin(struct file *file, struct address_space *mapping, > return -EPERM; > } > > - ret = shmem_get_folio(inode, index, &folio, SGP_WRITE); > + ret = shmem_get_folio(inode, index, &folio, SGP_WRITE, len); > if (ret) > return ret; > > @@ -3083,7 +3087,7 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to) > break; > } > > - error = shmem_get_folio(inode, index, &folio, SGP_READ); > + error = shmem_get_folio(inode, index, &folio, SGP_READ, PAGE_SIZE); > if (error) { > if (error == -EINVAL) > error = 0; > @@ -3260,7 +3264,7 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos, > break; > > error = shmem_get_folio(inode, *ppos / PAGE_SIZE, &folio, > - SGP_READ); > + SGP_READ, PAGE_SIZE); > if (error) { > if (error == -EINVAL) > error = 0; > @@ -3469,7 +3473,8 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset, > error = -ENOMEM; > else > error = shmem_get_folio(inode, index, &folio, > - SGP_FALLOC); > + SGP_FALLOC, > + (end - index) << PAGE_SHIFT); > if (error) { > info->fallocend = undo_fallocend; > /* Remove the !uptodate folios we added */ > @@ -3822,7 +3827,7 @@ static int shmem_symlink(struct mnt_idmap *idmap, struct inode *dir, > } else { > inode_nohighmem(inode); > inode->i_mapping->a_ops = &shmem_aops; > - error = shmem_get_folio(inode, 0, &folio, SGP_WRITE); > + error = shmem_get_folio(inode, 0, &folio, SGP_WRITE, PAGE_SIZE); > if (error) > goto out_remove_offset; > inode->i_op = &shmem_symlink_inode_operations; > @@ -3868,7 +3873,7 @@ static const char *shmem_get_link(struct dentry *dentry, struct inode *inode, > return ERR_PTR(-ECHILD); > } > } else { > - error = shmem_get_folio(inode, 0, &folio, SGP_READ); > + error = shmem_get_folio(inode, 0, &folio, SGP_READ, PAGE_SIZE); > if (error) > return ERR_PTR(error); > if (!folio) > @@ -5255,7 +5260,7 @@ struct folio *shmem_read_folio_gfp(struct address_space *mapping, > int error; > > error = shmem_get_folio_gfp(inode, index, &folio, SGP_CACHE, > - gfp, NULL, NULL); > + gfp, NULL, NULL, PAGE_SIZE); > if (error) > return ERR_PTR(error); > > diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c > index 3c3539c573e7..540a0c2d4325 100644 > --- a/mm/userfaultfd.c > +++ b/mm/userfaultfd.c > @@ -359,7 +359,7 @@ static int mfill_atomic_pte_continue(pmd_t *dst_pmd, > struct page *page; > int ret; > > - ret = shmem_get_folio(inode, pgoff, &folio, SGP_NOALLOC); > + ret = shmem_get_folio(inode, pgoff, &folio, SGP_NOALLOC, PAGE_SIZE); > /* Our caller expects us to return -EFAULT if we failed to find folio */ > if (ret == -ENOENT) > ret = -EFAULT; > -- > 2.43.0 >
On Fri, May 17, 2024 at 09:17:41AM -0700, Darrick J. Wong wrote: > On Wed, May 15, 2024 at 05:57:36AM +0000, Daniel Gomez wrote: > > In preparation for large folio in the write and fallocate paths, add > > file length argument in shmem_get_folio() path to be able to calculate > > the folio order based on the file size. Use of order-0 (PAGE_SIZE) for > > read, page cache read, and vm fault. > > > > This enables high order folios in the write and fallocate path once the > > folio order is calculated based on the length. > > > > Signed-off-by: Daniel Gomez <da.gomez@samsung.com> > > --- > > fs/xfs/scrub/xfile.c | 6 +++--- > > fs/xfs/xfs_buf_mem.c | 3 ++- > > include/linux/shmem_fs.h | 2 +- > > mm/khugepaged.c | 3 ++- > > mm/shmem.c | 35 ++++++++++++++++++++--------------- > > mm/userfaultfd.c | 2 +- > > 6 files changed, 29 insertions(+), 22 deletions(-) > > > > diff --git a/fs/xfs/scrub/xfile.c b/fs/xfs/scrub/xfile.c > > index 8cdd863db585..4905f5e4cb5d 100644 > > --- a/fs/xfs/scrub/xfile.c > > +++ b/fs/xfs/scrub/xfile.c > > @@ -127,7 +127,7 @@ xfile_load( > > unsigned int offset; > > > > if (shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, > > - SGP_READ) < 0) > > + SGP_READ, PAGE_SIZE) < 0) > > I suppose I /did/ say during LSFMM that for the current users of xfile.c > and xfs_buf_mem.c the order of the folio being returned doesn't really I not sure if I understood you well. Could you please elaborate on this? > matter, but why wouldn't the last argument here be "roundup_64(count, > PAGE_SIZE)" ? Shouldn't we at least hint to the page cache about the > folio order that we actually want instead of limiting it to order-0? For v2, I'll include your suggestions. I think we can also enable large folios in xfile_get_folio(), please check below: diff --git a/fs/xfs/scrub/xfile.c b/fs/xfs/scrub/xfile.c index 8cdd863db585..df8b495b4939 100644 --- a/fs/xfs/scrub/xfile.c +++ b/fs/xfs/scrub/xfile.c @@ -127,7 +127,7 @@ xfile_load( unsigned int offset; if (shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, - SGP_READ) < 0) + SGP_READ, roundup_64(count, PAGE_SIZE)) < 0) break; if (!folio) { /* @@ -197,7 +197,7 @@ xfile_store( unsigned int offset; if (shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, - SGP_CACHE) < 0) + SGP_CACHE, roundup_64(count, PAGE_SIZE)) < 0) break; if (filemap_check_wb_err(inode->i_mapping, 0)) { folio_unlock(folio); @@ -268,7 +268,8 @@ xfile_get_folio( pflags = memalloc_nofs_save(); error = shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, - (flags & XFILE_ALLOC) ? SGP_CACHE : SGP_READ); + (flags & XFILE_ALLOC) ? SGP_CACHE : SGP_READ, + roundup_64(i_size_read(inode), PAGE_SIZE)); memalloc_nofs_restore(pflags); if (error) return ERR_PTR(error); > > (Also it seems a little odd to me that the @index is in units of pgoff_t > but @len is in bytes.) I extended the shmem_get_folio() with @len to calculate folio order based on size (bytes). This is sent to ilog2() although I'm planning to use get_order() instead (after fixing the issues mentioned during the discussion). @index is used for __ffs() (same as in filemap). Would you use lofft for @len instead? Or what's your suggestion? Thanks, Daniel > > > break; > > if (!folio) { > > /* > > @@ -197,7 +197,7 @@ xfile_store( > > unsigned int offset; > > > > if (shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, > > - SGP_CACHE) < 0) > > + SGP_CACHE, PAGE_SIZE) < 0) > > break; > > if (filemap_check_wb_err(inode->i_mapping, 0)) { > > folio_unlock(folio); > > @@ -268,7 +268,7 @@ xfile_get_folio( > > > > pflags = memalloc_nofs_save(); > > error = shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, > > - (flags & XFILE_ALLOC) ? SGP_CACHE : SGP_READ); > > + (flags & XFILE_ALLOC) ? SGP_CACHE : SGP_READ, PAGE_SIZE); > > memalloc_nofs_restore(pflags); > > if (error) > > return ERR_PTR(error); > > diff --git a/fs/xfs/xfs_buf_mem.c b/fs/xfs/xfs_buf_mem.c > > index 9bb2d24de709..784c81d35a1f 100644 > > --- a/fs/xfs/xfs_buf_mem.c > > +++ b/fs/xfs/xfs_buf_mem.c > > @@ -149,7 +149,8 @@ xmbuf_map_page( > > return -ENOMEM; > > } > > > > - error = shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, SGP_CACHE); > > + error = shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, SGP_CACHE, > > + PAGE_SIZE); > > This is ok unless someone wants to use a different XMBUF_BLOCKSIZE. > > --D > > > if (error) > > return error; > > > > diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h > > index 3fb18f7eb73e..bc59b4a00228 100644 > > --- a/include/linux/shmem_fs.h > > +++ b/include/linux/shmem_fs.h > > @@ -142,7 +142,7 @@ enum sgp_type { > > }; > > > > int shmem_get_folio(struct inode *inode, pgoff_t index, struct folio **foliop, > > - enum sgp_type sgp); > > + enum sgp_type sgp, size_t len); > > struct folio *shmem_read_folio_gfp(struct address_space *mapping, > > pgoff_t index, gfp_t gfp); > > > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c > > index 38830174608f..947770ded68c 100644 > > --- a/mm/khugepaged.c > > +++ b/mm/khugepaged.c > > @@ -1863,7 +1863,8 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr, > > xas_unlock_irq(&xas); > > /* swap in or instantiate fallocated page */ > > if (shmem_get_folio(mapping->host, index, > > - &folio, SGP_NOALLOC)) { > > + &folio, SGP_NOALLOC, > > + PAGE_SIZE)) { > > result = SCAN_FAIL; > > goto xa_unlocked; > > } > > diff --git a/mm/shmem.c b/mm/shmem.c > > index d531018ffece..fcd2c9befe19 100644 > > --- a/mm/shmem.c > > +++ b/mm/shmem.c > > @@ -1134,7 +1134,7 @@ static struct folio *shmem_get_partial_folio(struct inode *inode, pgoff_t index) > > * (although in some cases this is just a waste of time). > > */ > > folio = NULL; > > - shmem_get_folio(inode, index, &folio, SGP_READ); > > + shmem_get_folio(inode, index, &folio, SGP_READ, PAGE_SIZE); > > return folio; > > } > > > > @@ -1844,7 +1844,7 @@ static struct folio *shmem_alloc_folio(gfp_t gfp, struct shmem_inode_info *info, > > > > static struct folio *shmem_alloc_and_add_folio(gfp_t gfp, > > struct inode *inode, pgoff_t index, > > - struct mm_struct *fault_mm, bool huge) > > + struct mm_struct *fault_mm, bool huge, size_t len) > > { > > struct address_space *mapping = inode->i_mapping; > > struct shmem_inode_info *info = SHMEM_I(inode); > > @@ -2173,7 +2173,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index, > > */ > > static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index, > > struct folio **foliop, enum sgp_type sgp, gfp_t gfp, > > - struct vm_fault *vmf, vm_fault_t *fault_type) > > + struct vm_fault *vmf, vm_fault_t *fault_type, size_t len) > > { > > struct vm_area_struct *vma = vmf ? vmf->vma : NULL; > > struct mm_struct *fault_mm; > > @@ -2258,7 +2258,7 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index, > > huge_gfp = vma_thp_gfp_mask(vma); > > huge_gfp = limit_gfp_mask(huge_gfp, gfp); > > folio = shmem_alloc_and_add_folio(huge_gfp, > > - inode, index, fault_mm, true); > > + inode, index, fault_mm, true, len); > > if (!IS_ERR(folio)) { > > count_vm_event(THP_FILE_ALLOC); > > goto alloced; > > @@ -2267,7 +2267,8 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index, > > goto repeat; > > } > > > > - folio = shmem_alloc_and_add_folio(gfp, inode, index, fault_mm, false); > > + folio = shmem_alloc_and_add_folio(gfp, inode, index, fault_mm, false, > > + len); > > if (IS_ERR(folio)) { > > error = PTR_ERR(folio); > > if (error == -EEXIST) > > @@ -2377,10 +2378,10 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index, > > * Return: 0 if successful, else a negative error code. > > */ > > int shmem_get_folio(struct inode *inode, pgoff_t index, struct folio **foliop, > > - enum sgp_type sgp) > > + enum sgp_type sgp, size_t len) > > { > > return shmem_get_folio_gfp(inode, index, foliop, sgp, > > - mapping_gfp_mask(inode->i_mapping), NULL, NULL); > > + mapping_gfp_mask(inode->i_mapping), NULL, NULL, len); > > } > > EXPORT_SYMBOL_GPL(shmem_get_folio); > > > > @@ -2475,7 +2476,7 @@ static vm_fault_t shmem_fault(struct vm_fault *vmf) > > > > WARN_ON_ONCE(vmf->page != NULL); > > err = shmem_get_folio_gfp(inode, vmf->pgoff, &folio, SGP_CACHE, > > - gfp, vmf, &ret); > > + gfp, vmf, &ret, PAGE_SIZE); > > if (err) > > return vmf_error(err); > > if (folio) { > > @@ -2954,6 +2955,9 @@ shmem_write_begin(struct file *file, struct address_space *mapping, > > struct folio *folio; > > int ret = 0; > > > > + if (!mapping_large_folio_support(mapping)) > > + len = min_t(size_t, len, PAGE_SIZE - offset_in_page(pos)); > > + > > /* i_rwsem is held by caller */ > > if (unlikely(info->seals & (F_SEAL_GROW | > > F_SEAL_WRITE | F_SEAL_FUTURE_WRITE))) { > > @@ -2963,7 +2967,7 @@ shmem_write_begin(struct file *file, struct address_space *mapping, > > return -EPERM; > > } > > > > - ret = shmem_get_folio(inode, index, &folio, SGP_WRITE); > > + ret = shmem_get_folio(inode, index, &folio, SGP_WRITE, len); > > if (ret) > > return ret; > > > > @@ -3083,7 +3087,7 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to) > > break; > > } > > > > - error = shmem_get_folio(inode, index, &folio, SGP_READ); > > + error = shmem_get_folio(inode, index, &folio, SGP_READ, PAGE_SIZE); > > if (error) { > > if (error == -EINVAL) > > error = 0; > > @@ -3260,7 +3264,7 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos, > > break; > > > > error = shmem_get_folio(inode, *ppos / PAGE_SIZE, &folio, > > - SGP_READ); > > + SGP_READ, PAGE_SIZE); > > if (error) { > > if (error == -EINVAL) > > error = 0; > > @@ -3469,7 +3473,8 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset, > > error = -ENOMEM; > > else > > error = shmem_get_folio(inode, index, &folio, > > - SGP_FALLOC); > > + SGP_FALLOC, > > + (end - index) << PAGE_SHIFT); > > if (error) { > > info->fallocend = undo_fallocend; > > /* Remove the !uptodate folios we added */ > > @@ -3822,7 +3827,7 @@ static int shmem_symlink(struct mnt_idmap *idmap, struct inode *dir, > > } else { > > inode_nohighmem(inode); > > inode->i_mapping->a_ops = &shmem_aops; > > - error = shmem_get_folio(inode, 0, &folio, SGP_WRITE); > > + error = shmem_get_folio(inode, 0, &folio, SGP_WRITE, PAGE_SIZE); > > if (error) > > goto out_remove_offset; > > inode->i_op = &shmem_symlink_inode_operations; > > @@ -3868,7 +3873,7 @@ static const char *shmem_get_link(struct dentry *dentry, struct inode *inode, > > return ERR_PTR(-ECHILD); > > } > > } else { > > - error = shmem_get_folio(inode, 0, &folio, SGP_READ); > > + error = shmem_get_folio(inode, 0, &folio, SGP_READ, PAGE_SIZE); > > if (error) > > return ERR_PTR(error); > > if (!folio) > > @@ -5255,7 +5260,7 @@ struct folio *shmem_read_folio_gfp(struct address_space *mapping, > > int error; > > > > error = shmem_get_folio_gfp(inode, index, &folio, SGP_CACHE, > > - gfp, NULL, NULL); > > + gfp, NULL, NULL, PAGE_SIZE); > > if (error) > > return ERR_PTR(error); > > > > diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c > > index 3c3539c573e7..540a0c2d4325 100644 > > --- a/mm/userfaultfd.c > > +++ b/mm/userfaultfd.c > > @@ -359,7 +359,7 @@ static int mfill_atomic_pte_continue(pmd_t *dst_pmd, > > struct page *page; > > int ret; > > > > - ret = shmem_get_folio(inode, pgoff, &folio, SGP_NOALLOC); > > + ret = shmem_get_folio(inode, pgoff, &folio, SGP_NOALLOC, PAGE_SIZE); > > /* Our caller expects us to return -EFAULT if we failed to find folio */ > > if (ret == -ENOENT) > > ret = -EFAULT; > > -- > > 2.43.0 > >
On Tue, May 21, 2024 at 11:38:33AM +0000, Daniel Gomez wrote: > On Fri, May 17, 2024 at 09:17:41AM -0700, Darrick J. Wong wrote: > > On Wed, May 15, 2024 at 05:57:36AM +0000, Daniel Gomez wrote: > > > In preparation for large folio in the write and fallocate paths, add > > > file length argument in shmem_get_folio() path to be able to calculate > > > the folio order based on the file size. Use of order-0 (PAGE_SIZE) for > > > read, page cache read, and vm fault. > > > > > > This enables high order folios in the write and fallocate path once the > > > folio order is calculated based on the length. > > > > > > Signed-off-by: Daniel Gomez <da.gomez@samsung.com> > > > --- > > > fs/xfs/scrub/xfile.c | 6 +++--- > > > fs/xfs/xfs_buf_mem.c | 3 ++- > > > include/linux/shmem_fs.h | 2 +- > > > mm/khugepaged.c | 3 ++- > > > mm/shmem.c | 35 ++++++++++++++++++++--------------- > > > mm/userfaultfd.c | 2 +- > > > 6 files changed, 29 insertions(+), 22 deletions(-) > > > > > > diff --git a/fs/xfs/scrub/xfile.c b/fs/xfs/scrub/xfile.c > > > index 8cdd863db585..4905f5e4cb5d 100644 > > > --- a/fs/xfs/scrub/xfile.c > > > +++ b/fs/xfs/scrub/xfile.c > > > @@ -127,7 +127,7 @@ xfile_load( > > > unsigned int offset; > > > > > > if (shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, > > > - SGP_READ) < 0) > > > + SGP_READ, PAGE_SIZE) < 0) > > > > I suppose I /did/ say during LSFMM that for the current users of xfile.c > > and xfs_buf_mem.c the order of the folio being returned doesn't really > I not sure if I understood you well. Could you please elaborate on this? Yes, I'll restate what I said in the session last week for those who weren't there: Currently, xfile.c and xfs_buf_mem.c are only used by online repair to stage a recordset while rebuilding an ondisk btree index. IOWs, they're ephemeral, so we don't care or need to optimize folio sizing. Some day they might be adapted for longer-term usage though, so we might as well try not to leave too many papercuts. xfs_buf_mem.c creates in-memory btrees that mimic the ondisk btrees, albeit with blocksize == PAGE_SIZE, regardless of the fs blocksize. For this case we probably aren't ever going to care about large folios. xfile.c is currently used to store fixed-size recordsets, names for rebuilding directories, and name/value pairs for rebuilding xattr structures. Records aren't allowed to be larger than PAGE_SIZE, names cannot be larger than MAXNAMELEN (255), and xattr values can't be larger than 64k. For that last case maybe it might be nice to get a large folio to reduce processing overhead, but huge xattrs aren't that common. > > matter, but why wouldn't the last argument here be "roundup_64(count, > > PAGE_SIZE)" ? Shouldn't we at least hint to the page cache about the > > folio order that we actually want instead of limiting it to order-0? > > For v2, I'll include your suggestions. I think we can also enable large folios > in xfile_get_folio(), please check below: > > diff --git a/fs/xfs/scrub/xfile.c b/fs/xfs/scrub/xfile.c > index 8cdd863db585..df8b495b4939 100644 > --- a/fs/xfs/scrub/xfile.c > +++ b/fs/xfs/scrub/xfile.c > @@ -127,7 +127,7 @@ xfile_load( > unsigned int offset; > > if (shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, > - SGP_READ) < 0) > + SGP_READ, roundup_64(count, PAGE_SIZE)) < 0) > break; > if (!folio) { > /* > @@ -197,7 +197,7 @@ xfile_store( > unsigned int offset; > > if (shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, > - SGP_CACHE) < 0) > + SGP_CACHE, roundup_64(count, PAGE_SIZE)) < 0) > break; > if (filemap_check_wb_err(inode->i_mapping, 0)) { > folio_unlock(folio); > @@ -268,7 +268,8 @@ xfile_get_folio( > > pflags = memalloc_nofs_save(); > error = shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, > - (flags & XFILE_ALLOC) ? SGP_CACHE : SGP_READ); > + (flags & XFILE_ALLOC) ? SGP_CACHE : SGP_READ, > + roundup_64(i_size_read(inode), PAGE_SIZE)); I'm not sure why you picked i_size_read here; the xfile could be several gigabytes long. xfile_get_folio want to look at a subset of the xfile, not all of it. roundup_64(len, PAGE_SIZE) perhaps? Also, should the rounding be done inside the shmem code so that callers don't have to know about that detail? > memalloc_nofs_restore(pflags); > if (error) > return ERR_PTR(error); > > > > > (Also it seems a little odd to me that the @index is in units of pgoff_t > > but @len is in bytes.) > > I extended the shmem_get_folio() with @len to calculate folio order based on > size (bytes). This is sent to ilog2() although I'm planning to use get_order() > instead (after fixing the issues mentioned during the discussion). @index is > used for __ffs() (same as in filemap). > > Would you use lofft for @len instead? Or what's your suggestion? I was reacting to @index, not @len. I might've shifted @index to "loff_t pos" but looking at the existing callsites it doesn't seem worth the churn. --D > Thanks, > Daniel > > > > > > break; > > > if (!folio) { > > > /* > > > @@ -197,7 +197,7 @@ xfile_store( > > > unsigned int offset; > > > > > > if (shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, > > > - SGP_CACHE) < 0) > > > + SGP_CACHE, PAGE_SIZE) < 0) > > > break; > > > if (filemap_check_wb_err(inode->i_mapping, 0)) { > > > folio_unlock(folio); > > > @@ -268,7 +268,7 @@ xfile_get_folio( > > > > > > pflags = memalloc_nofs_save(); > > > error = shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, > > > - (flags & XFILE_ALLOC) ? SGP_CACHE : SGP_READ); > > > + (flags & XFILE_ALLOC) ? SGP_CACHE : SGP_READ, PAGE_SIZE); > > > memalloc_nofs_restore(pflags); > > > if (error) > > > return ERR_PTR(error); > > > diff --git a/fs/xfs/xfs_buf_mem.c b/fs/xfs/xfs_buf_mem.c > > > index 9bb2d24de709..784c81d35a1f 100644 > > > --- a/fs/xfs/xfs_buf_mem.c > > > +++ b/fs/xfs/xfs_buf_mem.c > > > @@ -149,7 +149,8 @@ xmbuf_map_page( > > > return -ENOMEM; > > > } > > > > > > - error = shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, SGP_CACHE); > > > + error = shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, SGP_CACHE, > > > + PAGE_SIZE); > > > > This is ok unless someone wants to use a different XMBUF_BLOCKSIZE. > > > > --D > > > > > if (error) > > > return error; > > > > > > diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h > > > index 3fb18f7eb73e..bc59b4a00228 100644 > > > --- a/include/linux/shmem_fs.h > > > +++ b/include/linux/shmem_fs.h > > > @@ -142,7 +142,7 @@ enum sgp_type { > > > }; > > > > > > int shmem_get_folio(struct inode *inode, pgoff_t index, struct folio **foliop, > > > - enum sgp_type sgp); > > > + enum sgp_type sgp, size_t len); > > > struct folio *shmem_read_folio_gfp(struct address_space *mapping, > > > pgoff_t index, gfp_t gfp); > > > > > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c > > > index 38830174608f..947770ded68c 100644 > > > --- a/mm/khugepaged.c > > > +++ b/mm/khugepaged.c > > > @@ -1863,7 +1863,8 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr, > > > xas_unlock_irq(&xas); > > > /* swap in or instantiate fallocated page */ > > > if (shmem_get_folio(mapping->host, index, > > > - &folio, SGP_NOALLOC)) { > > > + &folio, SGP_NOALLOC, > > > + PAGE_SIZE)) { > > > result = SCAN_FAIL; > > > goto xa_unlocked; > > > } > > > diff --git a/mm/shmem.c b/mm/shmem.c > > > index d531018ffece..fcd2c9befe19 100644 > > > --- a/mm/shmem.c > > > +++ b/mm/shmem.c > > > @@ -1134,7 +1134,7 @@ static struct folio *shmem_get_partial_folio(struct inode *inode, pgoff_t index) > > > * (although in some cases this is just a waste of time). > > > */ > > > folio = NULL; > > > - shmem_get_folio(inode, index, &folio, SGP_READ); > > > + shmem_get_folio(inode, index, &folio, SGP_READ, PAGE_SIZE); > > > return folio; > > > } > > > > > > @@ -1844,7 +1844,7 @@ static struct folio *shmem_alloc_folio(gfp_t gfp, struct shmem_inode_info *info, > > > > > > static struct folio *shmem_alloc_and_add_folio(gfp_t gfp, > > > struct inode *inode, pgoff_t index, > > > - struct mm_struct *fault_mm, bool huge) > > > + struct mm_struct *fault_mm, bool huge, size_t len) > > > { > > > struct address_space *mapping = inode->i_mapping; > > > struct shmem_inode_info *info = SHMEM_I(inode); > > > @@ -2173,7 +2173,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index, > > > */ > > > static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index, > > > struct folio **foliop, enum sgp_type sgp, gfp_t gfp, > > > - struct vm_fault *vmf, vm_fault_t *fault_type) > > > + struct vm_fault *vmf, vm_fault_t *fault_type, size_t len) > > > { > > > struct vm_area_struct *vma = vmf ? vmf->vma : NULL; > > > struct mm_struct *fault_mm; > > > @@ -2258,7 +2258,7 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index, > > > huge_gfp = vma_thp_gfp_mask(vma); > > > huge_gfp = limit_gfp_mask(huge_gfp, gfp); > > > folio = shmem_alloc_and_add_folio(huge_gfp, > > > - inode, index, fault_mm, true); > > > + inode, index, fault_mm, true, len); > > > if (!IS_ERR(folio)) { > > > count_vm_event(THP_FILE_ALLOC); > > > goto alloced; > > > @@ -2267,7 +2267,8 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index, > > > goto repeat; > > > } > > > > > > - folio = shmem_alloc_and_add_folio(gfp, inode, index, fault_mm, false); > > > + folio = shmem_alloc_and_add_folio(gfp, inode, index, fault_mm, false, > > > + len); > > > if (IS_ERR(folio)) { > > > error = PTR_ERR(folio); > > > if (error == -EEXIST) > > > @@ -2377,10 +2378,10 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index, > > > * Return: 0 if successful, else a negative error code. > > > */ > > > int shmem_get_folio(struct inode *inode, pgoff_t index, struct folio **foliop, > > > - enum sgp_type sgp) > > > + enum sgp_type sgp, size_t len) > > > { > > > return shmem_get_folio_gfp(inode, index, foliop, sgp, > > > - mapping_gfp_mask(inode->i_mapping), NULL, NULL); > > > + mapping_gfp_mask(inode->i_mapping), NULL, NULL, len); > > > } > > > EXPORT_SYMBOL_GPL(shmem_get_folio); > > > > > > @@ -2475,7 +2476,7 @@ static vm_fault_t shmem_fault(struct vm_fault *vmf) > > > > > > WARN_ON_ONCE(vmf->page != NULL); > > > err = shmem_get_folio_gfp(inode, vmf->pgoff, &folio, SGP_CACHE, > > > - gfp, vmf, &ret); > > > + gfp, vmf, &ret, PAGE_SIZE); > > > if (err) > > > return vmf_error(err); > > > if (folio) { > > > @@ -2954,6 +2955,9 @@ shmem_write_begin(struct file *file, struct address_space *mapping, > > > struct folio *folio; > > > int ret = 0; > > > > > > + if (!mapping_large_folio_support(mapping)) > > > + len = min_t(size_t, len, PAGE_SIZE - offset_in_page(pos)); > > > + > > > /* i_rwsem is held by caller */ > > > if (unlikely(info->seals & (F_SEAL_GROW | > > > F_SEAL_WRITE | F_SEAL_FUTURE_WRITE))) { > > > @@ -2963,7 +2967,7 @@ shmem_write_begin(struct file *file, struct address_space *mapping, > > > return -EPERM; > > > } > > > > > > - ret = shmem_get_folio(inode, index, &folio, SGP_WRITE); > > > + ret = shmem_get_folio(inode, index, &folio, SGP_WRITE, len); > > > if (ret) > > > return ret; > > > > > > @@ -3083,7 +3087,7 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to) > > > break; > > > } > > > > > > - error = shmem_get_folio(inode, index, &folio, SGP_READ); > > > + error = shmem_get_folio(inode, index, &folio, SGP_READ, PAGE_SIZE); > > > if (error) { > > > if (error == -EINVAL) > > > error = 0; > > > @@ -3260,7 +3264,7 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos, > > > break; > > > > > > error = shmem_get_folio(inode, *ppos / PAGE_SIZE, &folio, > > > - SGP_READ); > > > + SGP_READ, PAGE_SIZE); > > > if (error) { > > > if (error == -EINVAL) > > > error = 0; > > > @@ -3469,7 +3473,8 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset, > > > error = -ENOMEM; > > > else > > > error = shmem_get_folio(inode, index, &folio, > > > - SGP_FALLOC); > > > + SGP_FALLOC, > > > + (end - index) << PAGE_SHIFT); > > > if (error) { > > > info->fallocend = undo_fallocend; > > > /* Remove the !uptodate folios we added */ > > > @@ -3822,7 +3827,7 @@ static int shmem_symlink(struct mnt_idmap *idmap, struct inode *dir, > > > } else { > > > inode_nohighmem(inode); > > > inode->i_mapping->a_ops = &shmem_aops; > > > - error = shmem_get_folio(inode, 0, &folio, SGP_WRITE); > > > + error = shmem_get_folio(inode, 0, &folio, SGP_WRITE, PAGE_SIZE); > > > if (error) > > > goto out_remove_offset; > > > inode->i_op = &shmem_symlink_inode_operations; > > > @@ -3868,7 +3873,7 @@ static const char *shmem_get_link(struct dentry *dentry, struct inode *inode, > > > return ERR_PTR(-ECHILD); > > > } > > > } else { > > > - error = shmem_get_folio(inode, 0, &folio, SGP_READ); > > > + error = shmem_get_folio(inode, 0, &folio, SGP_READ, PAGE_SIZE); > > > if (error) > > > return ERR_PTR(error); > > > if (!folio) > > > @@ -5255,7 +5260,7 @@ struct folio *shmem_read_folio_gfp(struct address_space *mapping, > > > int error; > > > > > > error = shmem_get_folio_gfp(inode, index, &folio, SGP_CACHE, > > > - gfp, NULL, NULL); > > > + gfp, NULL, NULL, PAGE_SIZE); > > > if (error) > > > return ERR_PTR(error); > > > > > > diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c > > > index 3c3539c573e7..540a0c2d4325 100644 > > > --- a/mm/userfaultfd.c > > > +++ b/mm/userfaultfd.c > > > @@ -359,7 +359,7 @@ static int mfill_atomic_pte_continue(pmd_t *dst_pmd, > > > struct page *page; > > > int ret; > > > > > > - ret = shmem_get_folio(inode, pgoff, &folio, SGP_NOALLOC); > > > + ret = shmem_get_folio(inode, pgoff, &folio, SGP_NOALLOC, PAGE_SIZE); > > > /* Our caller expects us to return -EFAULT if we failed to find folio */ > > > if (ret == -ENOENT) > > > ret = -EFAULT; > > > -- > > > 2.43.0 > > >
diff --git a/fs/xfs/scrub/xfile.c b/fs/xfs/scrub/xfile.c index 8cdd863db585..4905f5e4cb5d 100644 --- a/fs/xfs/scrub/xfile.c +++ b/fs/xfs/scrub/xfile.c @@ -127,7 +127,7 @@ xfile_load( unsigned int offset; if (shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, - SGP_READ) < 0) + SGP_READ, PAGE_SIZE) < 0) break; if (!folio) { /* @@ -197,7 +197,7 @@ xfile_store( unsigned int offset; if (shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, - SGP_CACHE) < 0) + SGP_CACHE, PAGE_SIZE) < 0) break; if (filemap_check_wb_err(inode->i_mapping, 0)) { folio_unlock(folio); @@ -268,7 +268,7 @@ xfile_get_folio( pflags = memalloc_nofs_save(); error = shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, - (flags & XFILE_ALLOC) ? SGP_CACHE : SGP_READ); + (flags & XFILE_ALLOC) ? SGP_CACHE : SGP_READ, PAGE_SIZE); memalloc_nofs_restore(pflags); if (error) return ERR_PTR(error); diff --git a/fs/xfs/xfs_buf_mem.c b/fs/xfs/xfs_buf_mem.c index 9bb2d24de709..784c81d35a1f 100644 --- a/fs/xfs/xfs_buf_mem.c +++ b/fs/xfs/xfs_buf_mem.c @@ -149,7 +149,8 @@ xmbuf_map_page( return -ENOMEM; } - error = shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, SGP_CACHE); + error = shmem_get_folio(inode, pos >> PAGE_SHIFT, &folio, SGP_CACHE, + PAGE_SIZE); if (error) return error; diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h index 3fb18f7eb73e..bc59b4a00228 100644 --- a/include/linux/shmem_fs.h +++ b/include/linux/shmem_fs.h @@ -142,7 +142,7 @@ enum sgp_type { }; int shmem_get_folio(struct inode *inode, pgoff_t index, struct folio **foliop, - enum sgp_type sgp); + enum sgp_type sgp, size_t len); struct folio *shmem_read_folio_gfp(struct address_space *mapping, pgoff_t index, gfp_t gfp); diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 38830174608f..947770ded68c 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1863,7 +1863,8 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr, xas_unlock_irq(&xas); /* swap in or instantiate fallocated page */ if (shmem_get_folio(mapping->host, index, - &folio, SGP_NOALLOC)) { + &folio, SGP_NOALLOC, + PAGE_SIZE)) { result = SCAN_FAIL; goto xa_unlocked; } diff --git a/mm/shmem.c b/mm/shmem.c index d531018ffece..fcd2c9befe19 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1134,7 +1134,7 @@ static struct folio *shmem_get_partial_folio(struct inode *inode, pgoff_t index) * (although in some cases this is just a waste of time). */ folio = NULL; - shmem_get_folio(inode, index, &folio, SGP_READ); + shmem_get_folio(inode, index, &folio, SGP_READ, PAGE_SIZE); return folio; } @@ -1844,7 +1844,7 @@ static struct folio *shmem_alloc_folio(gfp_t gfp, struct shmem_inode_info *info, static struct folio *shmem_alloc_and_add_folio(gfp_t gfp, struct inode *inode, pgoff_t index, - struct mm_struct *fault_mm, bool huge) + struct mm_struct *fault_mm, bool huge, size_t len) { struct address_space *mapping = inode->i_mapping; struct shmem_inode_info *info = SHMEM_I(inode); @@ -2173,7 +2173,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index, */ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index, struct folio **foliop, enum sgp_type sgp, gfp_t gfp, - struct vm_fault *vmf, vm_fault_t *fault_type) + struct vm_fault *vmf, vm_fault_t *fault_type, size_t len) { struct vm_area_struct *vma = vmf ? vmf->vma : NULL; struct mm_struct *fault_mm; @@ -2258,7 +2258,7 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index, huge_gfp = vma_thp_gfp_mask(vma); huge_gfp = limit_gfp_mask(huge_gfp, gfp); folio = shmem_alloc_and_add_folio(huge_gfp, - inode, index, fault_mm, true); + inode, index, fault_mm, true, len); if (!IS_ERR(folio)) { count_vm_event(THP_FILE_ALLOC); goto alloced; @@ -2267,7 +2267,8 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index, goto repeat; } - folio = shmem_alloc_and_add_folio(gfp, inode, index, fault_mm, false); + folio = shmem_alloc_and_add_folio(gfp, inode, index, fault_mm, false, + len); if (IS_ERR(folio)) { error = PTR_ERR(folio); if (error == -EEXIST) @@ -2377,10 +2378,10 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index, * Return: 0 if successful, else a negative error code. */ int shmem_get_folio(struct inode *inode, pgoff_t index, struct folio **foliop, - enum sgp_type sgp) + enum sgp_type sgp, size_t len) { return shmem_get_folio_gfp(inode, index, foliop, sgp, - mapping_gfp_mask(inode->i_mapping), NULL, NULL); + mapping_gfp_mask(inode->i_mapping), NULL, NULL, len); } EXPORT_SYMBOL_GPL(shmem_get_folio); @@ -2475,7 +2476,7 @@ static vm_fault_t shmem_fault(struct vm_fault *vmf) WARN_ON_ONCE(vmf->page != NULL); err = shmem_get_folio_gfp(inode, vmf->pgoff, &folio, SGP_CACHE, - gfp, vmf, &ret); + gfp, vmf, &ret, PAGE_SIZE); if (err) return vmf_error(err); if (folio) { @@ -2954,6 +2955,9 @@ shmem_write_begin(struct file *file, struct address_space *mapping, struct folio *folio; int ret = 0; + if (!mapping_large_folio_support(mapping)) + len = min_t(size_t, len, PAGE_SIZE - offset_in_page(pos)); + /* i_rwsem is held by caller */ if (unlikely(info->seals & (F_SEAL_GROW | F_SEAL_WRITE | F_SEAL_FUTURE_WRITE))) { @@ -2963,7 +2967,7 @@ shmem_write_begin(struct file *file, struct address_space *mapping, return -EPERM; } - ret = shmem_get_folio(inode, index, &folio, SGP_WRITE); + ret = shmem_get_folio(inode, index, &folio, SGP_WRITE, len); if (ret) return ret; @@ -3083,7 +3087,7 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to) break; } - error = shmem_get_folio(inode, index, &folio, SGP_READ); + error = shmem_get_folio(inode, index, &folio, SGP_READ, PAGE_SIZE); if (error) { if (error == -EINVAL) error = 0; @@ -3260,7 +3264,7 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos, break; error = shmem_get_folio(inode, *ppos / PAGE_SIZE, &folio, - SGP_READ); + SGP_READ, PAGE_SIZE); if (error) { if (error == -EINVAL) error = 0; @@ -3469,7 +3473,8 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset, error = -ENOMEM; else error = shmem_get_folio(inode, index, &folio, - SGP_FALLOC); + SGP_FALLOC, + (end - index) << PAGE_SHIFT); if (error) { info->fallocend = undo_fallocend; /* Remove the !uptodate folios we added */ @@ -3822,7 +3827,7 @@ static int shmem_symlink(struct mnt_idmap *idmap, struct inode *dir, } else { inode_nohighmem(inode); inode->i_mapping->a_ops = &shmem_aops; - error = shmem_get_folio(inode, 0, &folio, SGP_WRITE); + error = shmem_get_folio(inode, 0, &folio, SGP_WRITE, PAGE_SIZE); if (error) goto out_remove_offset; inode->i_op = &shmem_symlink_inode_operations; @@ -3868,7 +3873,7 @@ static const char *shmem_get_link(struct dentry *dentry, struct inode *inode, return ERR_PTR(-ECHILD); } } else { - error = shmem_get_folio(inode, 0, &folio, SGP_READ); + error = shmem_get_folio(inode, 0, &folio, SGP_READ, PAGE_SIZE); if (error) return ERR_PTR(error); if (!folio) @@ -5255,7 +5260,7 @@ struct folio *shmem_read_folio_gfp(struct address_space *mapping, int error; error = shmem_get_folio_gfp(inode, index, &folio, SGP_CACHE, - gfp, NULL, NULL); + gfp, NULL, NULL, PAGE_SIZE); if (error) return ERR_PTR(error); diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 3c3539c573e7..540a0c2d4325 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -359,7 +359,7 @@ static int mfill_atomic_pte_continue(pmd_t *dst_pmd, struct page *page; int ret; - ret = shmem_get_folio(inode, pgoff, &folio, SGP_NOALLOC); + ret = shmem_get_folio(inode, pgoff, &folio, SGP_NOALLOC, PAGE_SIZE); /* Our caller expects us to return -EFAULT if we failed to find folio */ if (ret == -ENOENT) ret = -EFAULT;
In preparation for large folio in the write and fallocate paths, add file length argument in shmem_get_folio() path to be able to calculate the folio order based on the file size. Use of order-0 (PAGE_SIZE) for read, page cache read, and vm fault. This enables high order folios in the write and fallocate path once the folio order is calculated based on the length. Signed-off-by: Daniel Gomez <da.gomez@samsung.com> --- fs/xfs/scrub/xfile.c | 6 +++--- fs/xfs/xfs_buf_mem.c | 3 ++- include/linux/shmem_fs.h | 2 +- mm/khugepaged.c | 3 ++- mm/shmem.c | 35 ++++++++++++++++++++--------------- mm/userfaultfd.c | 2 +- 6 files changed, 29 insertions(+), 22 deletions(-)