[0/1] Is pagecache_isize_extended() compatible with large folios?

Message ID 20240228182230.1401088-1-willy@infradead.org (mailing list archive)

Matthew Wilcox Feb. 28, 2024, 6:22 p.m. UTC
I'd appreciate some filesystem people checking my work here (in that
pagecache_isize_extended() may already be broken and we didn't notice).

As far as I can tell (and it'd be nice to explain this in the kernel-doc
a little more thoroughly), the reason pagecache_isize_extended() exists
is that some filesystems rely on getting page_mkwrite() calls in order to
instantiate blocks.  So if you have a filesystem using 512-byte blocks and
a 256-byte file mmapped, a store anywhere in the page will result in only
block 0 of the file being instantiated, and the folio will then be marked
dirty.

If we ftruncate the file to 2500 bytes before the folio gets written back,
then store to offset 2000, the PTE is still writable, so the store does
not fault and the filesystem is not notified; it will not instantiate a
block to store that information in.  Therefore if we truncate a file up,
we need to mark the PTE that straddles the old EOF as read-only so that
page_mkwrite() is called.
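
To make the arithmetic in that example concrete, here is a small userspace
sketch.  It is not kernel code, and the helper names (block_of,
blocks_covering, store_needs_new_block) are made up for illustration.  It
only shows that, with 512-byte blocks, a store at offset 2000 lands in
block 3, while the 256-byte file was backed by block 0 alone.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: index of the block containing a byte offset. */
static inline uint64_t block_of(uint64_t offset, uint32_t bsize)
{
	return offset / bsize;
}

/* Hypothetical helper: number of blocks needed to hold `size` bytes. */
static inline uint64_t blocks_covering(uint64_t size, uint32_t bsize)
{
	return (size + bsize - 1) / bsize;
}

/*
 * True if a store at `offset` falls in a block beyond those that covered
 * the file at its old size, i.e. a block the filesystem would only learn
 * about through a fresh page_mkwrite() call.
 */
static inline bool store_needs_new_block(uint64_t offset, uint64_t old_size,
					 uint32_t bsize)
{
	return block_of(offset, bsize) >= blocks_covering(old_size, bsize);
}
```

With old_size = 256 and bsize = 512, store_needs_new_block(2000, 256, 512)
is true, whereas a store at offset 300 stays inside block 0 and needs
nothing new.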

Now, I think this patch is safe because it's PAGE_SIZE that's important,
not the size of the folio.  We mmap files on PAGE_SIZE boundaries and
we're only asking if there could be a new store which causes a block
to be instantiated.  If the block size is >= PAGE_SIZE, there can't be.
If the folio size happens to be larger than PAGE_SIZE, it doesn't matter.
All that matters is that we protect the folio which crosses i_size if
block size < PAGE_SIZE.
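
The condition argued for above can be modelled in userspace roughly as
follows.  This is a sketch under assumptions, not the patch itself: the
function name eof_page_needs_wrprotect and its parameters are hypothetical,
and the checks reflect my reading of the reasoning in this mail (old size
`from`, new size `to`, block size `bsize`).  Folio size deliberately does
not appear anywhere in it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round x up to the next multiple of align (align must be non-zero). */
static inline uint64_t round_up_to(uint64_t x, uint64_t align)
{
	return (x + align - 1) / align * align;
}

/*
 * Hypothetical model: does extending a file from `from` to `to` bytes
 * require write-protecting the page that straddles the old EOF?
 */
static inline bool eof_page_needs_wrprotect(uint64_t from, uint64_t to,
					    uint64_t bsize, uint64_t page_size)
{
	uint64_t rounded_from;

	/* Not an extension, or one block covers at least a whole page. */
	if (from >= to || bsize >= page_size)
		return false;

	/* End of the block that already contains the old EOF. */
	rounded_from = round_up_to(from, bsize);

	/* The new size stays within that already-instantiated block. */
	if (to <= rounded_from)
		return false;

	/* That block ends on a page boundary: no hole block in this page. */
	if (rounded_from % page_size == 0)
		return false;

	return true;
}
```

For the earlier example (from = 256, to = 2500, bsize = 512, PAGE_SIZE =
4096) this returns true; with a 4096-byte or larger block size it returns
false regardless of how large the folio is.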

Matthew Wilcox (Oracle) (1):
  mm: Convert pagecache_isize_extended to use a folio

 mm/truncate.c | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)