Message ID | 20241129125303.4033164-1-david@redhat.com (mailing list archive) |
---|---|
State | New |
Series | [v1] mm/filemap: don't call folio_test_locked() without a reference in next_uptodate_folio() |
On Fri, Nov 29, 2024 at 01:53:03PM +0100, David Hildenbrand wrote:
> The folio can get freed + buddy-merged + reallocated in the meantime,
> resulting in us calling folio_test_locked() possibly on a tail page.
>
> This makes const_folio_flags VM_BUG_ON_PGFLAGS() when stumbling over
> the tail page.
>
> Could this result in other issues? Doesn't look like it. False positives
> and false negatives don't really matter, because this folio would get
> skipped either way when detecting that they have been reallocated in
> the meantime.
>
> Fix it by performing the folio_test_locked() checked after grabbing a
> reference. If this ever becomes a real problem, we could add a special
> helper that racily checks if the bit is set even on tail pages ... but
> let's hope that's not required so we can just handle it cleaner:
> work on the folio after we hold a reference.
>
> Do we really need the folio_test_locked() check if we are going to
> trylock briefly after? Well, we can at least avoid a xas_reload().
>
> It's a bit unclear which exact change introduced that issue. Likely,
> ever since we made PG_locked obey to the PF_NO_TAIL policy it could have
> been triggered in some way.
>
> Reported-by: syzbot+9f9a7f73fb079b2387a6@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/lkml/674184c9.050a0220.1cc393.0001.GAE@google.com/
> Fixes: 48c935ad88f5 ("page-flags: define PG_locked behavior on compound pages")
> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Hillf Danton <hdanton@sina.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Looks reasonable:

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
diff --git a/mm/filemap.c b/mm/filemap.c
index 7c76a123ba18b..f61cf51c22389 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3501,10 +3501,10 @@ static struct folio *next_uptodate_folio(struct xa_state *xas,
 			continue;
 		if (xa_is_value(folio))
 			continue;
-		if (folio_test_locked(folio))
-			continue;
 		if (!folio_try_get(folio))
 			continue;
+		if (folio_test_locked(folio))
+			goto skip;
 		/* Has the page moved or been split? */
 		if (unlikely(folio != xas_reload(xas)))
 			goto skip;
The folio can get freed + buddy-merged + reallocated in the meantime,
resulting in us calling folio_test_locked() possibly on a tail page.

This makes const_folio_flags VM_BUG_ON_PGFLAGS() when stumbling over
the tail page.

Could this result in other issues? Doesn't look like it. False positives
and false negatives don't really matter, because this folio would get
skipped either way when detecting that it has been reallocated in
the meantime.

Fix it by performing the folio_test_locked() check after grabbing a
reference. If this ever becomes a real problem, we could add a special
helper that racily checks if the bit is set even on tail pages ... but
let's hope that's not required so we can just handle it cleaner:
work on the folio after we hold a reference.

Do we really need the folio_test_locked() check if we are going to
trylock briefly after? Well, we can at least avoid a xas_reload().

It's a bit unclear which exact change introduced that issue. Likely,
ever since we made PG_locked obey the PF_NO_TAIL policy it could have
been triggered in some way.

Reported-by: syzbot+9f9a7f73fb079b2387a6@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/lkml/674184c9.050a0220.1cc393.0001.GAE@google.com/
Fixes: 48c935ad88f5 ("page-flags: define PG_locked behavior on compound pages")
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Hillf Danton <hdanton@sina.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/filemap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
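
For readers outside mm/, the ordering the patch establishes (pin first, inspect state second, revalidate third) can be illustrated with a small userspace sketch. This is not kernel code: `struct mock_folio`, `mock_try_get()`, `mock_put()` and `mock_next_usable()` are hypothetical names invented for this example, and C11 atomics stand in for the kernel's refcount and page-flag helpers. The sketch also assumes the object's storage stays valid after its refcount drops to zero, which is what makes a failed try-get safe to attempt.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio: a refcount plus a "locked" bit. */
struct mock_folio {
	atomic_int refcount;	/* 0 means the object has been released */
	atomic_bool locked;
};

/* Take a reference unless the count has already dropped to zero. */
static bool mock_try_get(struct mock_folio *f)
{
	int old = atomic_load(&f->refcount);

	while (old > 0) {
		/* On failure, 'old' is refreshed with the current count. */
		if (atomic_compare_exchange_weak(&f->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference taken with mock_try_get(). */
static void mock_put(struct mock_folio *f)
{
	atomic_fetch_sub(&f->refcount, 1);
}

/*
 * Return the object currently in *slot with a reference held, or NULL if
 * it should be skipped. State is only inspected once the object is pinned.
 */
static struct mock_folio *mock_next_usable(_Atomic(struct mock_folio *) *slot)
{
	struct mock_folio *f = atomic_load(slot);

	if (!f)
		return NULL;
	if (!mock_try_get(f))		/* pin first ... */
		return NULL;
	if (atomic_load(&f->locked))	/* ... then look at the flag */
		goto skip;
	if (f != atomic_load(slot))	/* slot changed while we were pinning? */
		goto skip;
	return f;
skip:
	mock_put(f);
	return NULL;
}
```

A caller that gets a non-NULL result owns one reference and must drop it with `mock_put()` when done. As in the patch, every skip path taken after the pin releases the reference again, which is why the locked check uses `goto skip` rather than a bare early return.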