Message ID | 20240718130212.23905-2-bfoster@redhat.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | iomap: zero dirty folios over unwritten mappings on zero range | expand |
On Thu, Jul 18, 2024 at 09:02:09AM -0400, Brian Foster wrote: > @@ -655,6 +655,8 @@ bool filemap_range_has_writeback(struct address_space *mapping, > folio_test_writeback(folio)) > break; > } > + if (folio) > + *start_byte = folio_pos(folio); > rcu_read_unlock(); > return folio != NULL; > } Distressingly, this is unsafe. We have no reference on the folio at this point (not one that matters, anyway). We have the rcu read lock, yes, but that doesn't protect enough to make folio_pos() safe. Since we do't have folio_get() here, the folio can be freed, sent back to the page allocator, and then reallocated to literally any purpose. As I'm reviewing patch 1/4, I have no idea if this is just a hint and you can survive it being completely wrong, or if this is going to cause problems.
On Thu, Jul 18, 2024 at 04:09:03PM +0100, Matthew Wilcox wrote: > On Thu, Jul 18, 2024 at 09:02:09AM -0400, Brian Foster wrote: > > @@ -655,6 +655,8 @@ bool filemap_range_has_writeback(struct address_space *mapping, > > folio_test_writeback(folio)) > > break; > > } > > + if (folio) > > + *start_byte = folio_pos(folio); > > rcu_read_unlock(); > > return folio != NULL; > > } > > Distressingly, this is unsafe. > > We have no reference on the folio at this point (not one that matters, > anyway). We have the rcu read lock, yes, but that doesn't protect enough > to make folio_pos() safe. > > Since we do't have folio_get() here, the folio can be freed, sent back to > the page allocator, and then reallocated to literally any purpose. As I'm > reviewing patch 1/4, I have no idea if this is just a hint and you can > survive it being completely wrong, or if this is going to cause problems. > Ah, thanks. I was unsure about this when I hacked it up but then got more focused on patch 3. I think for this implementation I'd want it to be an accurate pos of the first dirty/wb folio. I think this could possibly use filemap_range_has_writeback() (without patch 1) as more of a hint/optimization, but that might involve doing the FGP_NOCREAT thing from the previous variant of this prototype has and I was trying to avoid that. Do you think it would be reasonable to create a variant of this function that did the relevant bits from __filemap_get_folio(): if (!folio_try_get(folio)) goto repeat; if (unlikely(folio != xas_reload(&xas))) { folio_put(folio); goto repeat; } /* check dirty/wb etc. */ ... in order to either return a correct pos or maybe even a reference to the folio itself? iomap_zero_iter() wants the locked folio anyways, but that might be too ugly to pass through the iomap_folio_ops thing in current form. If that doesn't work, then I might chalk this up as another reason to just do the flush thing I was rambling about in the cover letter... Brian
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index a0a026d2d244..a15131a3fa12 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -1217,7 +1217,7 @@ int __filemap_add_folio(struct address_space *mapping, struct folio *folio, pgoff_t index, gfp_t gfp, void **shadowp); bool filemap_range_has_writeback(struct address_space *mapping, - loff_t start_byte, loff_t end_byte); + loff_t *start_byte, loff_t end_byte); /** * filemap_range_needs_writeback - check if range potentially needs writeback @@ -1242,7 +1242,7 @@ static inline bool filemap_range_needs_writeback(struct address_space *mapping, if (!mapping_tagged(mapping, PAGECACHE_TAG_DIRTY) && !mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK)) return false; - return filemap_range_has_writeback(mapping, start_byte, end_byte); + return filemap_range_has_writeback(mapping, &start_byte, end_byte); } /** diff --git a/mm/filemap.c b/mm/filemap.c index 657bcd887fdb..be0a219e8d9e 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -636,13 +636,13 @@ static bool mapping_needs_writeback(struct address_space *mapping) } bool filemap_range_has_writeback(struct address_space *mapping, - loff_t start_byte, loff_t end_byte) + loff_t *start_byte, loff_t end_byte) { - XA_STATE(xas, &mapping->i_pages, start_byte >> PAGE_SHIFT); + XA_STATE(xas, &mapping->i_pages, *start_byte >> PAGE_SHIFT); pgoff_t max = end_byte >> PAGE_SHIFT; struct folio *folio; - if (end_byte < start_byte) + if (end_byte < *start_byte) return false; rcu_read_lock(); @@ -655,6 +655,8 @@ bool filemap_range_has_writeback(struct address_space *mapping, folio_test_writeback(folio)) break; } + if (folio) + *start_byte = folio_pos(folio); rcu_read_unlock(); return folio != NULL; }
The iomap layer has a use case that wants to locate the first dirty or writeback folio in a particular range. filemap_range_has_writeback() already implements this with the exception that it only returns a boolean. Since the _needs_writeback() wrapper is currently the only caller, tweak has_writeback to take a pointer for the starting offset and update it to the offset of the first dirty folio found. Signed-off-by: Brian Foster <bfoster@redhat.com> --- include/linux/pagemap.h | 4 ++-- mm/filemap.c | 8 +++++--- 2 files changed, 7 insertions(+), 5 deletions(-)