diff mbox series

[stable,5.10] mm: hwpoison: remove the unnecessary THP check

Message ID 20211101194856.305642-1-shy828301@gmail.com (mailing list archive)
State New
Headers show
Series [stable,5.10] mm: hwpoison: remove the unnecessary THP check | expand

Commit Message

Yang Shi Nov. 1, 2021, 7:48 p.m. UTC
commit c7cb42e94473aafe553c0f2a3d8ca904599399ed upstream.

When handling THP hwpoison checked if the THP is in allocation or free
stage since hwpoison may mistreat it as hugetlb page.  After commit
415c64c1453a ("mm/memory-failure: split thp earlier in memory error
handling") the problem has been fixed, so this check is no longer
needed.  Remove it.  The side effect of the removal is hwpoison may
report unsplit THP instead of unknown error for shmem THP.  It seems not
like a big deal.

The following patch "mm: filemap: check if THP has hwpoisoned subpage
for PMD page fault" depends on this, which fixes shmem THP with
hwpoisoned subpage(s) are mapped PMD wrongly.  So this patch needs to be
backported to -stable as well.

Link: https://lkml.kernel.org/r/20211020210755.23964-2-shy828301@gmail.com
Signed-off-by: Yang Shi <shy828301@gmail.com>
Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault.patch
depends on this one.

 mm/memory-failure.c | 14 --------------
 1 file changed, 14 deletions(-)

Comments

Greg KH Nov. 3, 2021, 9:46 a.m. UTC | #1
On Mon, Nov 01, 2021 at 12:48:56PM -0700, Yang Shi wrote:
> commit c7cb42e94473aafe553c0f2a3d8ca904599399ed upstream.
> 
> When handling THP hwpoison checked if the THP is in allocation or free
> stage since hwpoison may mistreat it as hugetlb page.  After commit
> 415c64c1453a ("mm/memory-failure: split thp earlier in memory error
> handling") the problem has been fixed, so this check is no longer
> needed.  Remove it.  The side effect of the removal is hwpoison may
> report unsplit THP instead of unknown error for shmem THP.  It seems not
> like a big deal.
> 
> The following patch "mm: filemap: check if THP has hwpoisoned subpage
> for PMD page fault" depends on this, which fixes shmem THP with
> hwpoisoned subpage(s) are mapped PMD wrongly.  So this patch needs to be
> backported to -stable as well.
> 
> Link: https://lkml.kernel.org/r/20211020210755.23964-2-shy828301@gmail.com
> Signed-off-by: Yang Shi <shy828301@gmail.com>
> Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Peter Xu <peterx@redhat.com>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> ---
> mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault.patch
> depends on this one.

Both now queued up, thanks.

greg k-h
Greg KH Nov. 4, 2021, 4:53 p.m. UTC | #2
On Wed, Nov 03, 2021 at 10:46:24AM +0100, Greg KH wrote:
> On Mon, Nov 01, 2021 at 12:48:56PM -0700, Yang Shi wrote:
> > commit c7cb42e94473aafe553c0f2a3d8ca904599399ed upstream.
> > 
> > When handling THP hwpoison checked if the THP is in allocation or free
> > stage since hwpoison may mistreat it as hugetlb page.  After commit
> > 415c64c1453a ("mm/memory-failure: split thp earlier in memory error
> > handling") the problem has been fixed, so this check is no longer
> > needed.  Remove it.  The side effect of the removal is hwpoison may
> > report unsplit THP instead of unknown error for shmem THP.  It seems not
> > like a big deal.
> > 
> > The following patch "mm: filemap: check if THP has hwpoisoned subpage
> > for PMD page fault" depends on this, which fixes shmem THP with
> > hwpoisoned subpage(s) are mapped PMD wrongly.  So this patch needs to be
> > backported to -stable as well.
> > 
> > Link: https://lkml.kernel.org/r/20211020210755.23964-2-shy828301@gmail.com
> > Signed-off-by: Yang Shi <shy828301@gmail.com>
> > Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > Cc: Hugh Dickins <hughd@google.com>
> > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > Cc: Matthew Wilcox <willy@infradead.org>
> > Cc: Oscar Salvador <osalvador@suse.de>
> > Cc: Peter Xu <peterx@redhat.com>
> > Cc: <stable@vger.kernel.org>
> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> > Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> > ---
> > mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault.patch
> > depends on this one.
> 
> Both now queued up, thanks.

This breaks the build, see:
	https://lore.kernel.org/r/acabc414-164b-cd65-6a1a-cf912d8621d7@roeck-us.net

so I'm going to drop both of these now.  Please fix this up and resend a
tested series.

thanks,

greg k-h
Yang Shi Nov. 4, 2021, 5:43 p.m. UTC | #3
On Thu, Nov 4, 2021 at 9:53 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Wed, Nov 03, 2021 at 10:46:24AM +0100, Greg KH wrote:
> > On Mon, Nov 01, 2021 at 12:48:56PM -0700, Yang Shi wrote:
> > > commit c7cb42e94473aafe553c0f2a3d8ca904599399ed upstream.
> > >
> > > When handling THP hwpoison checked if the THP is in allocation or free
> > > stage since hwpoison may mistreat it as hugetlb page.  After commit
> > > 415c64c1453a ("mm/memory-failure: split thp earlier in memory error
> > > handling") the problem has been fixed, so this check is no longer
> > > needed.  Remove it.  The side effect of the removal is hwpoison may
> > > report unsplit THP instead of unknown error for shmem THP.  It seems not
> > > like a big deal.
> > >
> > > The following patch "mm: filemap: check if THP has hwpoisoned subpage
> > > for PMD page fault" depends on this, which fixes shmem THP with
> > > hwpoisoned subpage(s) are mapped PMD wrongly.  So this patch needs to be
> > > backported to -stable as well.
> > >
> > > Link: https://lkml.kernel.org/r/20211020210755.23964-2-shy828301@gmail.com
> > > Signed-off-by: Yang Shi <shy828301@gmail.com>
> > > Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > > Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > > Cc: Hugh Dickins <hughd@google.com>
> > > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > Cc: Matthew Wilcox <willy@infradead.org>
> > > Cc: Oscar Salvador <osalvador@suse.de>
> > > Cc: Peter Xu <peterx@redhat.com>
> > > Cc: <stable@vger.kernel.org>
> > > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> > > Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> > > ---
> > > mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault.patch
> > > depends on this one.
> >
> > Both now queued up, thanks.
>
> This breaks the build, see:
>         https://lore.kernel.org/r/acabc414-164b-cd65-6a1a-cf912d8621d7@roeck-us.net
>
> so I'm going to drop both of these now.  Please fix this up and resend a
> tested series.

Thanks for catching this. It is because I accidentally left the
PAGEFLAG_* macros into CONFIG_TRANSHUGE_PAGE section, so it is:

#ifdef CONFIG_TRANSHUGE_PAGE
...
#if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_TRANSHUGE_PAGE)
PAGEFLAG_xxx
#else
PAGEFLAG_FALSE_xxx
#endif
...
#endif

So when THP is disabled the PAGEFLAG_FALSE_xxx macro is actually absent.

The upstream has the same issue, will send a patch to fix it soon, and
send fixes (folded the new fix in) to -stable later. Sorry for the
inconvenience.

>
> thanks,
>
> greg k-h
Yang Shi Nov. 4, 2021, 6:07 p.m. UTC | #4
On Thu, Nov 4, 2021 at 10:43 AM Yang Shi <shy828301@gmail.com> wrote:
>
> On Thu, Nov 4, 2021 at 9:53 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Wed, Nov 03, 2021 at 10:46:24AM +0100, Greg KH wrote:
> > > On Mon, Nov 01, 2021 at 12:48:56PM -0700, Yang Shi wrote:
> > > > commit c7cb42e94473aafe553c0f2a3d8ca904599399ed upstream.
> > > >
> > > > When handling THP hwpoison checked if the THP is in allocation or free
> > > > stage since hwpoison may mistreat it as hugetlb page.  After commit
> > > > 415c64c1453a ("mm/memory-failure: split thp earlier in memory error
> > > > handling") the problem has been fixed, so this check is no longer
> > > > needed.  Remove it.  The side effect of the removal is hwpoison may
> > > > report unsplit THP instead of unknown error for shmem THP.  It seems not
> > > > like a big deal.
> > > >
> > > > The following patch "mm: filemap: check if THP has hwpoisoned subpage
> > > > for PMD page fault" depends on this, which fixes shmem THP with
> > > > hwpoisoned subpage(s) are mapped PMD wrongly.  So this patch needs to be
> > > > backported to -stable as well.
> > > >
> > > > Link: https://lkml.kernel.org/r/20211020210755.23964-2-shy828301@gmail.com
> > > > Signed-off-by: Yang Shi <shy828301@gmail.com>
> > > > Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > > > Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > > > Cc: Hugh Dickins <hughd@google.com>
> > > > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > Cc: Matthew Wilcox <willy@infradead.org>
> > > > Cc: Oscar Salvador <osalvador@suse.de>
> > > > Cc: Peter Xu <peterx@redhat.com>
> > > > Cc: <stable@vger.kernel.org>
> > > > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> > > > Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> > > > ---
> > > > mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault.patch
> > > > depends on this one.
> > >
> > > Both now queued up, thanks.
> >
> > This breaks the build, see:
> >         https://lore.kernel.org/r/acabc414-164b-cd65-6a1a-cf912d8621d7@roeck-us.net
> >
> > so I'm going to drop both of these now.  Please fix this up and resend a
> > tested series.
>
> Thanks for catching this. It is because I accidentally left the
> PAGEFLAG_* macros into CONFIG_TRANSHUGE_PAGE section, so it is:
>
> #ifdef CONFIG_TRANSHUGE_PAGE
> ...
> #if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_TRANSHUGE_PAGE)
> PAGEFLAG_xxx
> #else
> PAGEFLAG_FALSE_xxx
> #endif
> ...
> #endif
>
> So when THP is disabled the PAGEFLAG_FALSE_xxx macro is actually absent.
>
> The upstream has the same issue, will send a patch to fix it soon, and
> send fixes (folded the new fix in) to -stable later. Sorry for the
> inconvenience.

Further looking shows the upstream is good. I did *NOT* add the code
in CONFIG_TRANSHUGE_PAGE section. It seems the code section was moved
around when the patch was applied to 5.10.

Could you please fold the below patch into
mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault.patch?
Or I could prepare a patch for you.

 diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 0db6d83f70c3..1e33ba465195 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -676,20 +676,6 @@ static inline int PageTransCompoundMap(struct page *page)
         atomic_read(compound_mapcount_ptr(head));
 }

-#if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
-/*
- * PageHasHWPoisoned indicates that at least one subpage is hwpoisoned in the
- * compound page.
- *
- * This flag is set by hwpoison handler.  Cleared by THP split or free page.
- */
-PAGEFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND)
- TESTSCFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND)
-#else
-PAGEFLAG_FALSE(HasHWPoisoned)
- TESTSCFLAG_FALSE(HasHWPoisoned)
-#endif
-
 /*
  * PageTransTail returns true for both transparent huge pages
  * and hugetlbfs pages, so it should only be called when it's known
@@ -724,6 +710,20 @@ PAGEFLAG_FALSE(DoubleMap)
  TESTSCFLAG_FALSE(DoubleMap)
 #endif

+#if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
+/*
+ * PageHasHWPoisoned indicates that at least one subpage is hwpoisoned in the
+ * compound page.
+ *
+ * This flag is set by hwpoison handler.  Cleared by THP split or free page.
+ */
+PAGEFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND)
+ TESTSCFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND)
+#else
+PAGEFLAG_FALSE(HasHWPoisoned)
+ TESTSCFLAG_FALSE(HasHWPoisoned)
+#endif
+
 /*
  * For pages that are never mapped to userspace (and aren't PageSlab),
  * page_type may be used.  Because it is initialised to -1, we invert the

>
> >
> > thanks,
> >
> > greg k-h
Greg KH Nov. 4, 2021, 6:50 p.m. UTC | #5
On Thu, Nov 04, 2021 at 11:07:05AM -0700, Yang Shi wrote:
> On Thu, Nov 4, 2021 at 10:43 AM Yang Shi <shy828301@gmail.com> wrote:
> >
> > On Thu, Nov 4, 2021 at 9:53 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Wed, Nov 03, 2021 at 10:46:24AM +0100, Greg KH wrote:
> > > > On Mon, Nov 01, 2021 at 12:48:56PM -0700, Yang Shi wrote:
> > > > > commit c7cb42e94473aafe553c0f2a3d8ca904599399ed upstream.
> > > > >
> > > > > When handling THP hwpoison checked if the THP is in allocation or free
> > > > > stage since hwpoison may mistreat it as hugetlb page.  After commit
> > > > > 415c64c1453a ("mm/memory-failure: split thp earlier in memory error
> > > > > handling") the problem has been fixed, so this check is no longer
> > > > > needed.  Remove it.  The side effect of the removal is hwpoison may
> > > > > report unsplit THP instead of unknown error for shmem THP.  It seems not
> > > > > like a big deal.
> > > > >
> > > > > The following patch "mm: filemap: check if THP has hwpoisoned subpage
> > > > > for PMD page fault" depends on this, which fixes shmem THP with
> > > > > hwpoisoned subpage(s) are mapped PMD wrongly.  So this patch needs to be
> > > > > backported to -stable as well.
> > > > >
> > > > > Link: https://lkml.kernel.org/r/20211020210755.23964-2-shy828301@gmail.com
> > > > > Signed-off-by: Yang Shi <shy828301@gmail.com>
> > > > > Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > > > > Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > > > > Cc: Hugh Dickins <hughd@google.com>
> > > > > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > > Cc: Matthew Wilcox <willy@infradead.org>
> > > > > Cc: Oscar Salvador <osalvador@suse.de>
> > > > > Cc: Peter Xu <peterx@redhat.com>
> > > > > Cc: <stable@vger.kernel.org>
> > > > > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> > > > > Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> > > > > ---
> > > > > mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault.patch
> > > > > depends on this one.
> > > >
> > > > Both now queued up, thanks.
> > >
> > > This breaks the build, see:
> > >         https://lore.kernel.org/r/acabc414-164b-cd65-6a1a-cf912d8621d7@roeck-us.net
> > >
> > > so I'm going to drop both of these now.  Please fix this up and resend a
> > > tested series.
> >
> > Thanks for catching this. It is because I accidentally left the
> > PAGEFLAG_* macros into CONFIG_TRANSHUGE_PAGE section, so it is:
> >
> > #ifdef CONFIG_TRANSHUGE_PAGE
> > ...
> > #if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_TRANSHUGE_PAGE)
> > PAGEFLAG_xxx
> > #else
> > PAGEFLAG_FALSE_xxx
> > #endif
> > ...
> > #endif
> >
> > So when THP is disabled the PAGEFLAG_FALSE_xxx macro is actually absent.
> >
> > The upstream has the same issue, will send a patch to fix it soon, and
> > send fixes (folded the new fix in) to -stable later. Sorry for the
> > inconvenience.
> 
> Further looking shows the upstream is good. I did *NOT* add the code
> in CONFIG_TRANSHUGE_PAGE section. It seems the code section was moved
> around when the patch was applied to 5.10.
> 
> Could you please fold the below patch into
> mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault.patch?
> Or I could prepare a patch for you.

I need a working patch series please.

thanks,

greg k-h
diff mbox series

Patch

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 5d880d4eb9a2..bf601283fcf3 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -956,20 +956,6 @@  static int get_hwpoison_page(struct page *page)
 {
 	struct page *head = compound_head(page);
 
-	if (!PageHuge(head) && PageTransHuge(head)) {
-		/*
-		 * Non anonymous thp exists only in allocation/free time. We
-		 * can't handle such a case correctly, so let's give it up.
-		 * This should be better than triggering BUG_ON when kernel
-		 * tries to touch the "partially handled" page.
-		 */
-		if (!PageAnon(head)) {
-			pr_err("Memory failure: %#lx: non anonymous thp\n",
-				page_to_pfn(page));
-			return 0;
-		}
-	}
-
 	if (get_page_unless_zero(head)) {
 		if (head == compound_head(page))
 			return 1;