Message ID | 20230214075710.2401855-1-stevensd@google.com (mailing list archive) |
---|---|
State | New |
Series | [1/2] mm/khugepaged: set THP as uptodate earlier for shmem |
On Tue, Feb 14, 2023 at 04:57:09PM +0900, David Stevens wrote:
>  	/*
> -	 * At this point the hpage is locked and not up-to-date.
> -	 * It's safe to insert it into the page cache, because nobody would
> -	 * be able to map it or use it in another way until we unlock it.
> +	 * Mark hpage as up-to-date before inserting it into the page cache to
> +	 * prevent it from being mistaken for an fallocated but unwritten page.
> +	 * Inserting the unfinished hpage into the page cache is safe because
> +	 * it is locked, so nobody can map it or use it in another way until we
> +	 * unlock it.

No, that's not true.  The data has to be there before we mark it
uptodate.  See filemap_get_pages() for example, used as part of
read().  We don't lock the page unless we need to bring it uptodate
ourselves.
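The ordering rule stated above (data first, uptodate flag second) can be illustrated with a minimal user-space sketch. This is not kernel code: `struct demo_page`, `demo_fill_and_publish()` and `demo_read()` are hypothetical names invented for illustration, and C11 atomics stand in for the barriers the kernel pairs with its uptodate flag helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096

/* Hypothetical stand-in for a page cache page; not a kernel structure. */
struct demo_page {
	unsigned char data[DEMO_PAGE_SIZE];
	atomic_bool uptodate;	/* models the uptodate flag */
};

/* Writer: fill the data first, then publish it by setting the flag. */
static void demo_fill_and_publish(struct demo_page *page,
				  const unsigned char *src, size_t len)
{
	if (len > DEMO_PAGE_SIZE)
		len = DEMO_PAGE_SIZE;
	memcpy(page->data, src, len);
	memset(page->data + len, 0, DEMO_PAGE_SIZE - len);
	/* Release ordering: the data stores are visible before the flag. */
	atomic_store_explicit(&page->uptodate, true, memory_order_release);
}

/*
 * Lockless reader: trusts the contents as soon as the flag is set and
 * never takes a lock. Returns the number of bytes copied, or -1 if the
 * page is not ready yet.
 */
static long demo_read(struct demo_page *page, unsigned char *dst, size_t len)
{
	if (!atomic_load_explicit(&page->uptodate, memory_order_acquire))
		return -1;
	if (len > DEMO_PAGE_SIZE)
		len = DEMO_PAGE_SIZE;
	memcpy(dst, page->data, len);
	return (long)len;
}
```

If the writer set the flag before the copy, `demo_read()` could return bytes that were never written, which is the hazard for lockless readers of an early-uptodate hpage.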
On Wed, Feb 15, 2023 at 12:44 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Feb 14, 2023 at 04:57:09PM +0900, David Stevens wrote:
> >  	/*
> > -	 * At this point the hpage is locked and not up-to-date.
> > -	 * It's safe to insert it into the page cache, because nobody would
> > -	 * be able to map it or use it in another way until we unlock it.
> > +	 * Mark hpage as up-to-date before inserting it into the page cache to
> > +	 * prevent it from being mistaken for an fallocated but unwritten page.
> > +	 * Inserting the unfinished hpage into the page cache is safe because
> > +	 * it is locked, so nobody can map it or use it in another way until we
> > +	 * unlock it.
>
> No, that's not true.  The data has to be there before we mark it
> uptodate.  See filemap_get_pages() for example, used as part of
> read().  We don't lock the page unless we need to bring it uptodate
> ourselves.

I've been focusing on the shmem case for collapse_file and forgot to
think about the !is_shmem case. As far as I could tell, shmem doesn't
use filemap_get_pages() and everything else in filemap.c/shmem.c that
checks folio_test_uptodate also locks the folio. But yeah, this would
break the !is_shmem case and is kind of sketchy anyway. I'll put
together a better patch.

-David
On Wed, Feb 15, 2023 at 10:33:15AM +0900, David Stevens wrote:
> On Wed, Feb 15, 2023 at 12:44 AM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Tue, Feb 14, 2023 at 04:57:09PM +0900, David Stevens wrote:
> > >  	/*
> > > -	 * At this point the hpage is locked and not up-to-date.
> > > -	 * It's safe to insert it into the page cache, because nobody would
> > > -	 * be able to map it or use it in another way until we unlock it.
> > > +	 * Mark hpage as up-to-date before inserting it into the page cache to
> > > +	 * prevent it from being mistaken for an fallocated but unwritten page.
> > > +	 * Inserting the unfinished hpage into the page cache is safe because
> > > +	 * it is locked, so nobody can map it or use it in another way until we
> > > +	 * unlock it.
> >
> > No, that's not true.  The data has to be there before we mark it
> > uptodate.  See filemap_get_pages() for example, used as part of
> > read().  We don't lock the page unless we need to bring it uptodate
> > ourselves.
>
> I've been focusing on the shmem case for collapse_file and forgot to
> think about the !is_shmem case. As far as I could tell, shmem doesn't
> use filemap_get_pages() and everything else in filemap.c/shmem.c that
> checks folio_test_uptodate also locks the folio. But yeah, this would
> break the !is_shmem case and is kind of sketchy anyway. I'll put
> together a better patch.

AFAIU we lock the page iff it is !uptodate and we want to wait for it to
become uptodate, or, as Matthew said, when we want to make the
!uptodate->uptodate transition ourselves.

Take the same example of folio_seek_hole_data() that you mentioned:

	if (xa_is_value(folio) || folio_test_uptodate(folio))
		return seek_data ? start : end;
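To make the consequence of that check concrete, here is a simplified user-space model of the branch quoted above. It is only a sketch: `struct demo_folio` and `demo_seek_hole_data()` are hypothetical names, the `is_value` field stands in for `xa_is_value()`, and the partially-uptodate handling of the real function is collapsed into a single "treat as hole" path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page cache entry; not a kernel structure. */
struct demo_folio {
	bool is_value;	/* models xa_is_value(): e.g. a swapped-out shmem entry */
	bool uptodate;	/* models folio_test_uptodate() */
};

/*
 * Models the lockless check: an uptodate folio counts as data without
 * the folio lock ever being taken. Anything else is treated as a hole
 * here (assumption: no partially-uptodate support in this model).
 */
static long demo_seek_hole_data(const struct demo_folio *folio,
				long start, long end, bool seek_data)
{
	if (folio->is_value || folio->uptodate)
		return seek_data ? start : end;
	return seek_data ? end : start;
}
```

In this model, a folio whose flag is set before its contents are copied is reported as data at `start` immediately, because the folio lock plays no part in the decision.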
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 79be13133322..b648f1053d95 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1779,10 +1779,13 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 	hpage->mapping = mapping;
 
 	/*
-	 * At this point the hpage is locked and not up-to-date.
-	 * It's safe to insert it into the page cache, because nobody would
-	 * be able to map it or use it in another way until we unlock it.
+	 * Mark hpage as up-to-date before inserting it into the page cache to
+	 * prevent it from being mistaken for an fallocated but unwritten page.
+	 * Inserting the unfinished hpage into the page cache is safe because
+	 * it is locked, so nobody can map it or use it in another way until we
+	 * unlock it.
 	 */
+	SetPageUptodate(hpage);
 
 	xas_set(&xas, start);
 	for (index = start; index < end; index++) {