Message ID | 20211120174429.2596303-1-willy@infradead.org (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | filemap: Remove PageHWPoison check from next_uptodate_page() | expand |
On Sat, Nov 20, 2021 at 05:44:29PM +0000, Matthew Wilcox (Oracle) wrote: > Pages are individually marked as suffering from hardware poisoning. > Checking that the head page is not hardware poisoned doesn't make > sense; we might be after a subpage. We check each page individually > before we use it, so this was an optimisation gone wrong. > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Looks good to me. Thank you. Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
On Sat, Nov 20, 2021 at 9:44 AM Matthew Wilcox (Oracle) <willy@infradead.org> wrote: > > Pages are individually marked as suffering from hardware poisoning. > Checking that the head page is not hardware poisoned doesn't make > sense; we might be after a subpage. We check each page individually > before we use it, so this was an optimisation gone wrong. Yeah, it doesn't make too much sense to check the head page. And it seems the non-poisoned subpages could be PTE mapped instead of skipping the whole THP. Not sure if this is by design, it seems the hwpoisoned check in filemap_map_pages() does skip the subpages after the poisoned page. Or we should just skip the poisoned page itself? If so the below change may be needed: diff --git a/mm/filemap.c b/mm/filemap.c index daa0e23a6ee6..f1f0cb263b4a 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -3318,7 +3318,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, do { page = find_subpage(head, xas.xa_index); if (PageHWPoison(page)) - goto unlock; + goto skip; if (mmap_miss > 0) mmap_miss--; @@ -3337,6 +3337,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, do_set_pte(vmf, page, addr); /* no need to invalidate: a not-present page won't be cached */ update_mmu_cache(vma, addr, vmf->pte); +skip: unlock_page(head); continue; unlock: > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> > --- > mm/filemap.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/mm/filemap.c b/mm/filemap.c > index 0b6f996108b4..65973204112d 100644 > --- a/mm/filemap.c > +++ b/mm/filemap.c > @@ -3239,8 +3239,6 @@ static struct page *next_uptodate_page(struct page *page, > goto skip; > if (!PageUptodate(page) || PageReadahead(page)) > goto skip; > - if (PageHWPoison(page)) > - goto skip; > if (!trylock_page(page)) > goto skip; > if (page->mapping != mapping) > -- > 2.33.0 >
On Mon, Nov 22, 2021 at 11:28:10AM -0800, Yang Shi wrote: > On Sat, Nov 20, 2021 at 9:44 AM Matthew Wilcox (Oracle) > <willy@infradead.org> wrote: > > > > Pages are individually marked as suffering from hardware poisoning. > > Checking that the head page is not hardware poisoned doesn't make > > sense; we might be after a subpage. We check each page individually > > before we use it, so this was an optimisation gone wrong. > > Yeah, it doesn't make too much sense to check the head page. And it > seems the non-poisoned subpages could be PTE mapped instead of > skipping the whole THP. > > Not sure if this is by design, it seems the hwpoisoned check in > filemap_map_pages() does skip the subpages after the poisoned page. Or > we should just skip the poisoned page itself? If so the below change > may be needed: > > diff --git a/mm/filemap.c b/mm/filemap.c > index daa0e23a6ee6..f1f0cb263b4a 100644 > --- a/mm/filemap.c > +++ b/mm/filemap.c > @@ -3318,7 +3318,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, > do { > page = find_subpage(head, xas.xa_index); > if (PageHWPoison(page)) > - goto unlock; > + goto skip; > > if (mmap_miss > 0) > mmap_miss--; > @@ -3337,6 +3337,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, > do_set_pte(vmf, page, addr); > /* no need to invalidate: a not-present page won't be cached */ > update_mmu_cache(vma, addr, vmf->pte); > +skip: > unlock_page(head); > continue; > unlock: first_map_page() or next_map_page() returns a page (if found) with holding the refcount, and the new 'goto skip' path skips releasing it. So this looks to me lead to the mismatch of refcount. Could you explain the intention a little more (maybe related to your recent patch about keeping hwpoison page in pagecache?) ? Thanks, Naoya Horiguchi > > > > > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> > > --- > > mm/filemap.c | 2 -- > > 1 file changed, 2 deletions(-) > > > > diff --git a/mm/filemap.c b/mm/filemap.c > > index 0b6f996108b4..65973204112d 100644 > > --- a/mm/filemap.c > > +++ b/mm/filemap.c > > @@ -3239,8 +3239,6 @@ static struct page *next_uptodate_page(struct page *page, > > goto skip; > > if (!PageUptodate(page) || PageReadahead(page)) > > goto skip; > > - if (PageHWPoison(page)) > > - goto skip; > > if (!trylock_page(page)) > > goto skip; > > if (page->mapping != mapping) > > -- > > 2.33.0 > >
On Tue, Nov 23, 2021 at 4:11 PM HORIGUCHI NAOYA(堀口 直也) <naoya.horiguchi@nec.com> wrote: > > On Mon, Nov 22, 2021 at 11:28:10AM -0800, Yang Shi wrote: > > On Sat, Nov 20, 2021 at 9:44 AM Matthew Wilcox (Oracle) > > <willy@infradead.org> wrote: > > > > > > Pages are individually marked as suffering from hardware poisoning. > > > Checking that the head page is not hardware poisoned doesn't make > > > sense; we might be after a subpage. We check each page individually > > > before we use it, so this was an optimisation gone wrong. > > > > Yeah, it doesn't make too much sense to check the head page. And it > > seems the non-poisoned subpages could be PTE mapped instead of > > skipping the whole THP. > > > > Not sure if this is by design, it seems the hwpoisoned check in > > filemap_map_pages() does skip the subpages after the poisoned page. Or > > we should just skip the poisoned page itself? If so the below change > > may be needed: > > > > diff --git a/mm/filemap.c b/mm/filemap.c > > index daa0e23a6ee6..f1f0cb263b4a 100644 > > --- a/mm/filemap.c > > +++ b/mm/filemap.c > > @@ -3318,7 +3318,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, > > do { > > page = find_subpage(head, xas.xa_index); > > if (PageHWPoison(page)) > > - goto unlock; > > + goto skip; > > > > if (mmap_miss > 0) > > mmap_miss--; > > @@ -3337,6 +3337,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, > > do_set_pte(vmf, page, addr); > > /* no need to invalidate: a not-present page won't be cached */ > > update_mmu_cache(vma, addr, vmf->pte); > > +skip: > > unlock_page(head); > > continue; > > unlock: > > first_map_page() or next_map_page() returns a page (if found) with > holding the refcount, and the new 'goto skip' path skips releasing it. > So this looks to me lead to the mismatch of refcount. > Could you explain the intention a little more (maybe related to your > recent patch about keeping hwpoison page in pagecache?) ? No, not related to my patches. The current code maps the subpages by PTEs *before* the poisoned page, but skips the subpages *after* the poisoned page IIUC. It seems not right, I thought the code was intended to map all subpages by PTEs except the poisoned pages. So the suggested code is trying to fix the misbehavior. That code is just a quick and untested illustration to the above hypothesis. The corrected version: diff --git a/mm/filemap.c b/mm/filemap.c index daa0e23a6ee6..1a76e3edc878 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -3317,8 +3317,11 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, addr, &vmf->ptl); do { page = find_subpage(head, xas.xa_index); - if (PageHWPoison(page)) - goto unlock; + if (PageHWPoison(page)) { + unlock_page(page); + put_page(page); + continue; + } if (mmap_miss > 0) mmap_miss--; > > Thanks, > Naoya Horiguchi > > > > > > > > > > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> > > > --- > > > mm/filemap.c | 2 -- > > > 1 file changed, 2 deletions(-) > > > > > > diff --git a/mm/filemap.c b/mm/filemap.c > > > index 0b6f996108b4..65973204112d 100644 > > > --- a/mm/filemap.c > > > +++ b/mm/filemap.c > > > @@ -3239,8 +3239,6 @@ static struct page *next_uptodate_page(struct page *page, > > > goto skip; > > > if (!PageUptodate(page) || PageReadahead(page)) > > > goto skip; > > > - if (PageHWPoison(page)) > > > - goto skip; > > > if (!trylock_page(page)) > > > goto skip; > > > if (page->mapping != mapping) > > > -- > > > 2.33.0 > > >
On Tue, Nov 23, 2021 at 4:32 PM Yang Shi <shy828301@gmail.com> wrote: > > On Tue, Nov 23, 2021 at 4:11 PM HORIGUCHI NAOYA(堀口 直也) > <naoya.horiguchi@nec.com> wrote: > > > > On Mon, Nov 22, 2021 at 11:28:10AM -0800, Yang Shi wrote: > > > On Sat, Nov 20, 2021 at 9:44 AM Matthew Wilcox (Oracle) > > > <willy@infradead.org> wrote: > > > > > > > > Pages are individually marked as suffering from hardware poisoning. > > > > Checking that the head page is not hardware poisoned doesn't make > > > > sense; we might be after a subpage. We check each page individually > > > > before we use it, so this was an optimisation gone wrong. > > > > > > Yeah, it doesn't make too much sense to check the head page. And it > > > seems the non-poisoned subpages could be PTE mapped instead of > > > skipping the whole THP. > > > > > > Not sure if this is by design, it seems the hwpoisoned check in > > > filemap_map_pages() does skip the subpages after the poisoned page. Or > > > we should just skip the poisoned page itself? If so the below change > > > may be needed: > > > > > > diff --git a/mm/filemap.c b/mm/filemap.c > > > index daa0e23a6ee6..f1f0cb263b4a 100644 > > > --- a/mm/filemap.c > > > +++ b/mm/filemap.c > > > @@ -3318,7 +3318,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, > > > do { > > > page = find_subpage(head, xas.xa_index); > > > if (PageHWPoison(page)) > > > - goto unlock; > > > + goto skip; > > > > > > if (mmap_miss > 0) > > > mmap_miss--; > > > @@ -3337,6 +3337,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, > > > do_set_pte(vmf, page, addr); > > > /* no need to invalidate: a not-present page won't be cached */ > > > update_mmu_cache(vma, addr, vmf->pte); > > > +skip: > > > unlock_page(head); > > > continue; > > > unlock: > > > > first_map_page() or next_map_page() returns a page (if found) with > > holding the refcount, and the new 'goto skip' path skips releasing it. > > So this looks to me lead to the mismatch of refcount. > > Could you explain the intention a little more (maybe related to your > > recent patch about keeping hwpoison page in pagecache?) ? > > No, not related to my patches. > > The current code maps the subpages by PTEs *before* the poisoned page, > but skips the subpages *after* the poisoned page IIUC. It seems not > right, I thought the code was intended to map all subpages by PTEs > except the poisoned pages. So the suggested code is trying to fix the > misbehavior. Err... I think I misread the code. It does iterate every subpages to map each of them by PTE and just skip the hwpoisoned subpages. Sorry for the confusion. > > That code is just a quick and untested illustration to the above > hypothesis. The corrected version: > > diff --git a/mm/filemap.c b/mm/filemap.c > index daa0e23a6ee6..1a76e3edc878 100644 > --- a/mm/filemap.c > +++ b/mm/filemap.c > @@ -3317,8 +3317,11 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, > vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, addr, &vmf->ptl); > do { > page = find_subpage(head, xas.xa_index); > - if (PageHWPoison(page)) > - goto unlock; > + if (PageHWPoison(page)) { > + unlock_page(page); > + put_page(page); > + continue; > + } > > if (mmap_miss > 0) > mmap_miss--; > > > > > Thanks, > > Naoya Horiguchi > > > > > > > > > > > > > > > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> > > > > --- > > > > mm/filemap.c | 2 -- > > > > 1 file changed, 2 deletions(-) > > > > > > > > diff --git a/mm/filemap.c b/mm/filemap.c > > > > index 0b6f996108b4..65973204112d 100644 > > > > --- a/mm/filemap.c > > > > +++ b/mm/filemap.c > > > > @@ -3239,8 +3239,6 @@ static struct page *next_uptodate_page(struct page *page, > > > > goto skip; > > > > if (!PageUptodate(page) || PageReadahead(page)) > > > > goto skip; > > > > - if (PageHWPoison(page)) > > > > - goto skip; > > > > if (!trylock_page(page)) > > > > goto skip; > > > > if (page->mapping != mapping) > > > > -- > > > > 2.33.0 > > > >
On Sat, 20 Nov 2021 17:44:29 +0000 "Matthew Wilcox (Oracle)" <willy@infradead.org> wrote: > Pages are individually marked as suffering from hardware poisoning. > Checking that the head page is not hardware poisoned doesn't make > sense; we might be after a subpage. We check each page individually > before we use it, so this was an optimisation gone wrong. What were the runtime effects of this bug?
On Tue, Nov 23, 2021 at 07:24:42PM -0800, Andrew Morton wrote: > On Sat, 20 Nov 2021 17:44:29 +0000 "Matthew Wilcox (Oracle)" <willy@infradead.org> wrote: > > > Pages are individually marked as suffering from hardware poisoning. > > Checking that the head page is not hardware poisoned doesn't make > > sense; we might be after a subpage. We check each page individually > > before we use it, so this was an optimisation gone wrong. > > What were the runtime effects of this bug? We'd fall back to the slow path when there was no need to do that.
diff --git a/mm/filemap.c b/mm/filemap.c index 0b6f996108b4..65973204112d 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -3239,8 +3239,6 @@ static struct page *next_uptodate_page(struct page *page, goto skip; if (!PageUptodate(page) || PageReadahead(page)) goto skip; - if (PageHWPoison(page)) - goto skip; if (!trylock_page(page)) goto skip; if (page->mapping != mapping)
Pages are individually marked as suffering from hardware poisoning. Checking that the head page is not hardware poisoned doesn't make sense; we might be after a subpage. We check each page individually before we use it, so this was an optimisation gone wrong. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> --- mm/filemap.c | 2 -- 1 file changed, 2 deletions(-)