Message ID | 20190312201742.22935-1-josef@toxicpanda.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | filemap: don't unlock null page in FGP_FOR_MMAP case | expand |
On Tue, 12 Mar 2019 16:17:42 -0400 Josef Bacik <josef@toxicpanda.com> wrote: > We noticed a panic happening in production with the filemap fault pages > because we were unlocking a NULL page. If add_to_page_cache() fails > then we'll have a NULL page, so fix this check to only unlock if we > have a valid page. > > Signed-off-by: Josef Bacik <josef@toxicpanda.com> > --- > mm/filemap.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/filemap.c b/mm/filemap.c > index cace3eb8069f..2815cb79a246 100644 > --- a/mm/filemap.c > +++ b/mm/filemap.c > @@ -1663,7 +1663,7 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t offset, > * add_to_page_cache_lru locks the page, and for mmap we expect > * an unlocked page. > */ > - if (fgp_flags & FGP_FOR_MMAP) > + if (page && (fgp_flags & FGP_FOR_MMAP)) > unlock_page(page); > } > Fixes "filemap: kill page_cache_read usage in filemap_fault". This patch series: filemap-kill-page_cache_read-usage-in-filemap_fault.patch filemap-kill-page_cache_read-usage-in-filemap_fault-fix.patch filemap-kill-page_cache_read-usage-in-filemap_fault-fix-2.patch filemap-pass-vm_fault-to-the-mmap-ra-helpers.patch filemap-drop-the-mmap_sem-for-all-blocking-operations.patch filemap-drop-the-mmap_sem-for-all-blocking-operations-v6.patch filemap-drop-the-mmap_sem-for-all-blocking-operations-fix.patch filemap-drop-the-mmap_sem-for-all-blocking-operations-checkpatch-fixes.patch has been stuck since December. I have a note here that syzbot reported a use-after-free. What's the situation with that? I also have a cryptic note that filemap-drop-the-mmap_sem-for-all-blocking-operations-v6.patch is "still fishy". I'm not sure what I meant by the latter - the (small amount of) review seems to be OK. Do you recall what issues there might have been and the status of those?
On Tue, 2019-03-12 at 16:17 -0400, Josef Bacik wrote: > We noticed a panic happening in production with the filemap fault > pages > because we were unlocking a NULL page. If add_to_page_cache() fails > then we'll have a NULL page, so fix this check to only unlock if we > have a valid page. > > Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Rik van Riel <riel@surriel.com>
On Tue, Mar 12, 2019 at 02:06:23PM -0700, Andrew Morton wrote: > On Tue, 12 Mar 2019 16:17:42 -0400 Josef Bacik <josef@toxicpanda.com> wrote: > > > We noticed a panic happening in production with the filemap fault pages > > because we were unlocking a NULL page. If add_to_page_cache() fails > > then we'll have a NULL page, so fix this check to only unlock if we > > have a valid page. > > > > Signed-off-by: Josef Bacik <josef@toxicpanda.com> > > --- > > mm/filemap.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/mm/filemap.c b/mm/filemap.c > > index cace3eb8069f..2815cb79a246 100644 > > --- a/mm/filemap.c > > +++ b/mm/filemap.c > > @@ -1663,7 +1663,7 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t offset, > > * add_to_page_cache_lru locks the page, and for mmap we expect > > * an unlocked page. > > */ > > - if (fgp_flags & FGP_FOR_MMAP) > > + if (page && (fgp_flags & FGP_FOR_MMAP)) > > unlock_page(page); > > } > > > > Fixes "filemap: kill page_cache_read usage in filemap_fault". > > This patch series: > > filemap-kill-page_cache_read-usage-in-filemap_fault.patch > filemap-kill-page_cache_read-usage-in-filemap_fault-fix.patch > filemap-kill-page_cache_read-usage-in-filemap_fault-fix-2.patch > filemap-pass-vm_fault-to-the-mmap-ra-helpers.patch > filemap-drop-the-mmap_sem-for-all-blocking-operations.patch > filemap-drop-the-mmap_sem-for-all-blocking-operations-v6.patch > filemap-drop-the-mmap_sem-for-all-blocking-operations-fix.patch > filemap-drop-the-mmap_sem-for-all-blocking-operations-checkpatch-fixes.patch > > has been stuck since December. I have a note here that syzbot reported > a use-after-free. What's the situation with that? > Yup that was fixed by filemap-drop-the-mmap_sem-for-all-blocking-operations-fix.patch so we're good there. > I also have a cryptic note that > filemap-drop-the-mmap_sem-for-all-blocking-operations-v6.patch is > "still fishy". I'm not sure what I meant by the latter - the (small > amount of) review seems to be OK. Do you recall what issues there > might have been and the status of those? Looking back at the discussion I _think_ the "still fishy" thing was you were concerned that now if we can't get a page in do_async_mmap_readahead and we dropped the mmap sem we'd return VM_FAULT_RETRY instead of -ENOMEM. Jan pointed out that we have to do this as we've dropped the mmap_sem and it's the only safe thing to return so we're ok there. If that's not it then I'm not sure why you were still concerned with it. For what its worth these patches have been in production since December, we only noticed this panic on a small set of hosts that still have ext4 so by-in-large they've been stable. Thanks, Josef
diff --git a/mm/filemap.c b/mm/filemap.c index cace3eb8069f..2815cb79a246 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -1663,7 +1663,7 @@ struct page *pagecache_get_page(struct address_space *mapping, pgoff_t offset, * add_to_page_cache_lru locks the page, and for mmap we expect * an unlocked page. */ - if (fgp_flags & FGP_FOR_MMAP) + if (page && (fgp_flags & FGP_FOR_MMAP)) unlock_page(page); }
We noticed a panic happening in production with the filemap fault pages because we were unlocking a NULL page. If add_to_page_cache() fails then we'll have a NULL page, so fix this check to only unlock if we have a valid page. Signed-off-by: Josef Bacik <josef@toxicpanda.com> --- mm/filemap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)