[1/2] mm/khugepaged: set THP as uptodate earlier for shmem

Message ID 20230214075710.2401855-1-stevensd@google.com (mailing list archive)
State New

Commit Message

David Stevens Feb. 14, 2023, 7:57 a.m. UTC
From: David Stevens <stevensd@chromium.org>

In collapse_file, mark the THP as up-to-date before inserting it into
the page cache. This fixes a race where folio_seek_hole_data would
mistake the THP for an fallocated but unwritten page. This race is
visible to userspace via data temporarily disappearing from
SEEK_DATA/SEEK_HOLE, which can cause data loss for applications that use
lseek to efficiently snapshot sparse shmem.

Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: David Stevens <stevensd@chromium.org>
---
 mm/khugepaged.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)
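
For readers unfamiliar with the userspace side of this race, the following is a minimal sketch of how an application might enumerate the data extents of a sparse file with lseek(SEEK_DATA)/lseek(SEEK_HOLE). It is not taken from the patch; the helper name count_data_bytes() is hypothetical, and the fallback SEEK_DATA/SEEK_HOLE values are the Linux ones, used only if the libc headers do not expose them. A snapshot tool built on a loop like this would skip any range that is temporarily reported as a hole, which is how the race described above can lose data.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback for headers that hide these; the values below are the Linux ones. */
#ifndef SEEK_DATA
#define SEEK_DATA 3
#define SEEK_HOLE 4
#endif

/*
 * Hypothetical helper: walk the data extents of fd and return the total
 * number of data bytes, or -1 on error. Each extent is printed as a
 * half-open byte range. Note that lseek() moves the file offset of fd.
 */
static off_t count_data_bytes(int fd)
{
	off_t total = 0;
	off_t pos = 0;

	assert(fd >= 0);

	for (;;) {
		/* Find the start of the next data extent at or after pos. */
		off_t data = lseek(fd, pos, SEEK_DATA);
		if (data < 0) {
			if (errno == ENXIO)
				return total;	/* no data at or after pos */
			return -1;
		}

		/* Find where that extent ends; EOF counts as a hole. */
		off_t hole = lseek(fd, data, SEEK_HOLE);
		if (hole < 0)
			return -1;

		printf("data: [%lld, %lld)\n", (long long)data, (long long)hole);
		total += hole - data;
		pos = hole;
	}
}
```

A caller would open the shmem or tmpfs file, pass the descriptor to count_data_bytes(), and copy only the printed ranges. If the kernel briefly reports a populated range as a hole while khugepaged is collapsing it, that range is silently omitted from the copy.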

Comments

Matthew Wilcox (Oracle) Feb. 14, 2023, 3:44 p.m. UTC | #1
On Tue, Feb 14, 2023 at 04:57:09PM +0900, David Stevens wrote:
>  	/*
> -	 * At this point the hpage is locked and not up-to-date.
> -	 * It's safe to insert it into the page cache, because nobody would
> -	 * be able to map it or use it in another way until we unlock it.
> +	 * Mark hpage as up-to-date before inserting it into the page cache to
> +	 * prevent it from being mistaken for an fallocated but unwritten page.
> +	 * Inserting the unfinished hpage into the page cache is safe because
> +	 * it is locked, so nobody can map it or use it in another way until we
> +	 * unlock it.

No, that's not true.  The data has to be there before we mark it
uptodate.  See filemap_get_pages() for example, used as part of
read().  We don't lock the page unless we need to bring it uptodate
ourselves.
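
The ordering rule in the reply above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct toy_page, toy_fill_and_publish() and toy_read() are hypothetical names, C11 atomics stand in for the kernel's page flag and memory barriers, and nothing here is kernel code. The point it shows is that a lockless reader trusts the contents as soon as it sees the flag set, so the writer must store the data before setting the flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a page cache page; not a kernel type. */
struct toy_page {
	char data[64];
	atomic_bool uptodate;
};

/* Writer: fill in the contents first, then publish them with release ordering. */
static void toy_fill_and_publish(struct toy_page *p, const char *src, size_t len)
{
	assert(p != NULL && src != NULL);
	if (len > sizeof(p->data))
		len = sizeof(p->data);
	memcpy(p->data, src, len);
	atomic_store_explicit(&p->uptodate, true, memory_order_release);
}

/*
 * Lockless reader: copies the contents out only if the flag is already set.
 * Returns false otherwise; a real reader would take the page lock and wait
 * (or bring the page uptodate itself) at that point.
 */
static bool toy_read(struct toy_page *p, char *dst, size_t len)
{
	assert(p != NULL && dst != NULL);
	if (!atomic_load_explicit(&p->uptodate, memory_order_acquire))
		return false;
	if (len > sizeof(p->data))
		len = sizeof(p->data);
	memcpy(dst, p->data, len);
	return true;
}
```

Swapping the two statements at the end of toy_fill_and_publish() would model what the proposed patch does: the flag becomes visible while the contents are still stale, and toy_read() would return garbage without ever taking a lock.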
David Stevens Feb. 15, 2023, 1:33 a.m. UTC | #2
On Wed, Feb 15, 2023 at 12:44 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Feb 14, 2023 at 04:57:09PM +0900, David Stevens wrote:
> >       /*
> > -      * At this point the hpage is locked and not up-to-date.
> > -      * It's safe to insert it into the page cache, because nobody would
> > -      * be able to map it or use it in another way until we unlock it.
> > +      * Mark hpage as up-to-date before inserting it into the page cache to
> > +      * prevent it from being mistaken for an fallocated but unwritten page.
> > +      * Inserting the unfinished hpage into the page cache is safe because
> > +      * it is locked, so nobody can map it or use it in another way until we
> > +      * unlock it.
>
> No, that's not true.  The data has to be there before we mark it
> uptodate.  See filemap_get_pages() for example, used as part of
> read().  We don't lock the page unless we need to bring it uptodate
> ourselves.

I've been focusing on the shmem case for collapse_file and forgot to
think about the !is_shmem case. As far as I could tell, shmem doesn't
use filemap_get_pages() and everything else in filemap.c/shmem.c that
checks folio_test_uptodate also locks the folio. But yeah, this would
break the !is_shmem case and is kind of sketchy anyway. I'll put
together a better patch.

-David
Peter Xu Feb. 15, 2023, 10:05 p.m. UTC | #3
On Wed, Feb 15, 2023 at 10:33:15AM +0900, David Stevens wrote:
> On Wed, Feb 15, 2023 at 12:44 AM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Tue, Feb 14, 2023 at 04:57:09PM +0900, David Stevens wrote:
> > >       /*
> > > -      * At this point the hpage is locked and not up-to-date.
> > > -      * It's safe to insert it into the page cache, because nobody would
> > > -      * be able to map it or use it in another way until we unlock it.
> > > +      * Mark hpage as up-to-date before inserting it into the page cache to
> > > +      * prevent it from being mistaken for an fallocated but unwritten page.
> > > +      * Inserting the unfinished hpage into the page cache is safe because
> > > +      * it is locked, so nobody can map it or use it in another way until we
> > > +      * unlock it.
> >
> > No, that's not true.  The data has to be there before we mark it
> > uptodate.  See filemap_get_pages() for example, used as part of
> > read().  We don't lock the page unless we need to bring it uptodate
> > ourselves.
> 
> I've been focusing on the shmem case for collapse_file and forgot to
> think about the !is_shmem case. As far as I could tell, shmem doesn't
> use filemap_get_pages() and everything else in filemap.c/shmem.c that
> checks folio_test_uptodate also locks the folio. But yeah, this would
> break the !is_shmem case and is kind of sketchy anyway. I'll put
> together a better patch.

AFAIU we lock the page iff it is !uptodate and we want to wait for it to
become uptodate, or, as Matthew said, when we want to make the
!uptodate->uptodate transition ourselves.

Take the same example of folio_seek_hole_data() that you mentioned:

	if (xa_is_value(folio) || folio_test_uptodate(folio))
		return seek_data ? start : end;

Patch

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 79be13133322..b648f1053d95 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1779,10 +1779,13 @@  static int collapse_file(struct mm_struct *mm, unsigned long addr,
 	hpage->mapping = mapping;
 
 	/*
-	 * At this point the hpage is locked and not up-to-date.
-	 * It's safe to insert it into the page cache, because nobody would
-	 * be able to map it or use it in another way until we unlock it.
+	 * Mark hpage as up-to-date before inserting it into the page cache to
+	 * prevent it from being mistaken for an fallocated but unwritten page.
+	 * Inserting the unfinished hpage into the page cache is safe because
+	 * it is locked, so nobody can map it or use it in another way until we
+	 * unlock it.
 	 */
+	SetPageUptodate(hpage);
 
 	xas_set(&xas, start);
 	for (index = start; index < end; index++) {