Message ID | 20211101194856.305642-1-shy828301@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [stable,5.10] mm: hwpoison: remove the unnecessary THP check | expand |
On Mon, Nov 01, 2021 at 12:48:56PM -0700, Yang Shi wrote: > commit c7cb42e94473aafe553c0f2a3d8ca904599399ed upstream. > > When handling THP hwpoison checked if the THP is in allocation or free > stage since hwpoison may mistreat it as hugetlb page. After commit > 415c64c1453a ("mm/memory-failure: split thp earlier in memory error > handling") the problem has been fixed, so this check is no longer > needed. Remove it. The side effect of the removal is hwpoison may > report unsplit THP instead of unknown error for shmem THP. It seems not > like a big deal. > > The following patch "mm: filemap: check if THP has hwpoisoned subpage > for PMD page fault" depends on this, which fixes shmem THP with > hwpoisoned subpage(s) are mapped PMD wrongly. So this patch needs to be > backported to -stable as well. > > Link: https://lkml.kernel.org/r/20211020210755.23964-2-shy828301@gmail.com > Signed-off-by: Yang Shi <shy828301@gmail.com> > Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com> > Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com> > Cc: Hugh Dickins <hughd@google.com> > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > Cc: Matthew Wilcox <willy@infradead.org> > Cc: Oscar Salvador <osalvador@suse.de> > Cc: Peter Xu <peterx@redhat.com> > Cc: <stable@vger.kernel.org> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org> > Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> > --- > mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault.patch > depends on this one. Both now queued up, thanks. greg k-h
On Wed, Nov 03, 2021 at 10:46:24AM +0100, Greg KH wrote: > On Mon, Nov 01, 2021 at 12:48:56PM -0700, Yang Shi wrote: > > commit c7cb42e94473aafe553c0f2a3d8ca904599399ed upstream. > > > > When handling THP hwpoison checked if the THP is in allocation or free > > stage since hwpoison may mistreat it as hugetlb page. After commit > > 415c64c1453a ("mm/memory-failure: split thp earlier in memory error > > handling") the problem has been fixed, so this check is no longer > > needed. Remove it. The side effect of the removal is hwpoison may > > report unsplit THP instead of unknown error for shmem THP. It seems not > > like a big deal. > > > > The following patch "mm: filemap: check if THP has hwpoisoned subpage > > for PMD page fault" depends on this, which fixes shmem THP with > > hwpoisoned subpage(s) are mapped PMD wrongly. So this patch needs to be > > backported to -stable as well. > > > > Link: https://lkml.kernel.org/r/20211020210755.23964-2-shy828301@gmail.com > > Signed-off-by: Yang Shi <shy828301@gmail.com> > > Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com> > > Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com> > > Cc: Hugh Dickins <hughd@google.com> > > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > Cc: Matthew Wilcox <willy@infradead.org> > > Cc: Oscar Salvador <osalvador@suse.de> > > Cc: Peter Xu <peterx@redhat.com> > > Cc: <stable@vger.kernel.org> > > Signed-off-by: Andrew Morton <akpm@linux-foundation.org> > > Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> > > --- > > mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault.patch > > depends on this one. > > Both now queued up, thanks. This breaks the build, see: https://lore.kernel.org/r/acabc414-164b-cd65-6a1a-cf912d8621d7@roeck-us.net so I'm going to drop both of these now. Please fix this up and resend a tested series. thanks, greg k-h
On Thu, Nov 4, 2021 at 9:53 AM Greg KH <gregkh@linuxfoundation.org> wrote: > > On Wed, Nov 03, 2021 at 10:46:24AM +0100, Greg KH wrote: > > On Mon, Nov 01, 2021 at 12:48:56PM -0700, Yang Shi wrote: > > > commit c7cb42e94473aafe553c0f2a3d8ca904599399ed upstream. > > > > > > When handling THP hwpoison checked if the THP is in allocation or free > > > stage since hwpoison may mistreat it as hugetlb page. After commit > > > 415c64c1453a ("mm/memory-failure: split thp earlier in memory error > > > handling") the problem has been fixed, so this check is no longer > > > needed. Remove it. The side effect of the removal is hwpoison may > > > report unsplit THP instead of unknown error for shmem THP. It seems not > > > like a big deal. > > > > > > The following patch "mm: filemap: check if THP has hwpoisoned subpage > > > for PMD page fault" depends on this, which fixes shmem THP with > > > hwpoisoned subpage(s) are mapped PMD wrongly. So this patch needs to be > > > backported to -stable as well. > > > > > > Link: https://lkml.kernel.org/r/20211020210755.23964-2-shy828301@gmail.com > > > Signed-off-by: Yang Shi <shy828301@gmail.com> > > > Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com> > > > Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com> > > > Cc: Hugh Dickins <hughd@google.com> > > > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > > Cc: Matthew Wilcox <willy@infradead.org> > > > Cc: Oscar Salvador <osalvador@suse.de> > > > Cc: Peter Xu <peterx@redhat.com> > > > Cc: <stable@vger.kernel.org> > > > Signed-off-by: Andrew Morton <akpm@linux-foundation.org> > > > Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> > > > --- > > > mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault.patch > > > depends on this one. > > > > Both now queued up, thanks. > > This breaks the build, see: > https://lore.kernel.org/r/acabc414-164b-cd65-6a1a-cf912d8621d7@roeck-us.net > > so I'm going to drop both of these now. Please fix this up and resend a > tested series. Thanks for catching this. It is because I accidentally left the PAGEFLAG_* macros into CONFIG_TRANSHUGE_PAGE section, so it is: #ifdef CONFIG_TRANSHUGE_PAGE ... #if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_TRANSHUGE_PAGE) PAGEFLAG_xxx #else PAGEFLAG_FALSE_xxx #endif ... #endif So when THP is disabled the PAGEFLAG_FALSE_xxx macro is actually absent. The upstream has the same issue, will send a patch to fix it soon, and send fixes (folded the new fix in) to -stable later. Sorry for the inconvenience. > > thanks, > > greg k-h
On Thu, Nov 4, 2021 at 10:43 AM Yang Shi <shy828301@gmail.com> wrote: > > On Thu, Nov 4, 2021 at 9:53 AM Greg KH <gregkh@linuxfoundation.org> wrote: > > > > On Wed, Nov 03, 2021 at 10:46:24AM +0100, Greg KH wrote: > > > On Mon, Nov 01, 2021 at 12:48:56PM -0700, Yang Shi wrote: > > > > commit c7cb42e94473aafe553c0f2a3d8ca904599399ed upstream. > > > > > > > > When handling THP hwpoison checked if the THP is in allocation or free > > > > stage since hwpoison may mistreat it as hugetlb page. After commit > > > > 415c64c1453a ("mm/memory-failure: split thp earlier in memory error > > > > handling") the problem has been fixed, so this check is no longer > > > > needed. Remove it. The side effect of the removal is hwpoison may > > > > report unsplit THP instead of unknown error for shmem THP. It seems not > > > > like a big deal. > > > > > > > > The following patch "mm: filemap: check if THP has hwpoisoned subpage > > > > for PMD page fault" depends on this, which fixes shmem THP with > > > > hwpoisoned subpage(s) are mapped PMD wrongly. So this patch needs to be > > > > backported to -stable as well. > > > > > > > > Link: https://lkml.kernel.org/r/20211020210755.23964-2-shy828301@gmail.com > > > > Signed-off-by: Yang Shi <shy828301@gmail.com> > > > > Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com> > > > > Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com> > > > > Cc: Hugh Dickins <hughd@google.com> > > > > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > > > Cc: Matthew Wilcox <willy@infradead.org> > > > > Cc: Oscar Salvador <osalvador@suse.de> > > > > Cc: Peter Xu <peterx@redhat.com> > > > > Cc: <stable@vger.kernel.org> > > > > Signed-off-by: Andrew Morton <akpm@linux-foundation.org> > > > > Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> > > > > --- > > > > mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault.patch > > > > depends on this one. > > > > > > Both now queued up, thanks. > > > > This breaks the build, see: > > https://lore.kernel.org/r/acabc414-164b-cd65-6a1a-cf912d8621d7@roeck-us.net > > > > so I'm going to drop both of these now. Please fix this up and resend a > > tested series. > > Thanks for catching this. It is because I accidentally left the > PAGEFLAG_* macros into CONFIG_TRANSHUGE_PAGE section, so it is: > > #ifdef CONFIG_TRANSHUGE_PAGE > ... > #if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_TRANSHUGE_PAGE) > PAGEFLAG_xxx > #else > PAGEFLAG_FALSE_xxx > #endif > ... > #endif > > So when THP is disabled the PAGEFLAG_FALSE_xxx macro is actually absent. > > The upstream has the same issue, will send a patch to fix it soon, and > send fixes (folded the new fix in) to -stable later. Sorry for the > inconvenience. Further looking shows the upstream is good. I did *NOT* add the code in CONFIG_TRANSHUGE_PAGE section. It seems the code section was moved around when the patch was applied to 5.10. Could you please fold the below patch into mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault.patch? Or I could prepare a patch for you. diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index 0db6d83f70c3..1e33ba465195 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -676,20 +676,6 @@ static inline int PageTransCompoundMap(struct page *page) atomic_read(compound_mapcount_ptr(head)); } -#if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_TRANSPARENT_HUGEPAGE) -/* - * PageHasHWPoisoned indicates that at least one subpage is hwpoisoned in the - * compound page. - * - * This flag is set by hwpoison handler. Cleared by THP split or free page. - */ -PAGEFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND) - TESTSCFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND) -#else -PAGEFLAG_FALSE(HasHWPoisoned) - TESTSCFLAG_FALSE(HasHWPoisoned) -#endif - /* * PageTransTail returns true for both transparent huge pages * and hugetlbfs pages, so it should only be called when it's known @@ -724,6 +710,20 @@ PAGEFLAG_FALSE(DoubleMap) TESTSCFLAG_FALSE(DoubleMap) #endif +#if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_TRANSPARENT_HUGEPAGE) +/* + * PageHasHWPoisoned indicates that at least one subpage is hwpoisoned in the + * compound page. + * + * This flag is set by hwpoison handler. Cleared by THP split or free page. + */ +PAGEFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND) + TESTSCFLAG(HasHWPoisoned, has_hwpoisoned, PF_SECOND) +#else +PAGEFLAG_FALSE(HasHWPoisoned) + TESTSCFLAG_FALSE(HasHWPoisoned) +#endif + /* * For pages that are never mapped to userspace (and aren't PageSlab), * page_type may be used. Because it is initialised to -1, we invert the > > > > > thanks, > > > > greg k-h
On Thu, Nov 04, 2021 at 11:07:05AM -0700, Yang Shi wrote: > On Thu, Nov 4, 2021 at 10:43 AM Yang Shi <shy828301@gmail.com> wrote: > > > > On Thu, Nov 4, 2021 at 9:53 AM Greg KH <gregkh@linuxfoundation.org> wrote: > > > > > > On Wed, Nov 03, 2021 at 10:46:24AM +0100, Greg KH wrote: > > > > On Mon, Nov 01, 2021 at 12:48:56PM -0700, Yang Shi wrote: > > > > > commit c7cb42e94473aafe553c0f2a3d8ca904599399ed upstream. > > > > > > > > > > When handling THP hwpoison checked if the THP is in allocation or free > > > > > stage since hwpoison may mistreat it as hugetlb page. After commit > > > > > 415c64c1453a ("mm/memory-failure: split thp earlier in memory error > > > > > handling") the problem has been fixed, so this check is no longer > > > > > needed. Remove it. The side effect of the removal is hwpoison may > > > > > report unsplit THP instead of unknown error for shmem THP. It seems not > > > > > like a big deal. > > > > > > > > > > The following patch "mm: filemap: check if THP has hwpoisoned subpage > > > > > for PMD page fault" depends on this, which fixes shmem THP with > > > > > hwpoisoned subpage(s) are mapped PMD wrongly. So this patch needs to be > > > > > backported to -stable as well. > > > > > > > > > > Link: https://lkml.kernel.org/r/20211020210755.23964-2-shy828301@gmail.com > > > > > Signed-off-by: Yang Shi <shy828301@gmail.com> > > > > > Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com> > > > > > Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com> > > > > > Cc: Hugh Dickins <hughd@google.com> > > > > > Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > > > > Cc: Matthew Wilcox <willy@infradead.org> > > > > > Cc: Oscar Salvador <osalvador@suse.de> > > > > > Cc: Peter Xu <peterx@redhat.com> > > > > > Cc: <stable@vger.kernel.org> > > > > > Signed-off-by: Andrew Morton <akpm@linux-foundation.org> > > > > > Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> > > > > > --- > > > > > mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault.patch > > > > > depends on this one. > > > > > > > > Both now queued up, thanks. > > > > > > This breaks the build, see: > > > https://lore.kernel.org/r/acabc414-164b-cd65-6a1a-cf912d8621d7@roeck-us.net > > > > > > so I'm going to drop both of these now. Please fix this up and resend a > > > tested series. > > > > Thanks for catching this. It is because I accidentally left the > > PAGEFLAG_* macros into CONFIG_TRANSHUGE_PAGE section, so it is: > > > > #ifdef CONFIG_TRANSHUGE_PAGE > > ... > > #if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_TRANSHUGE_PAGE) > > PAGEFLAG_xxx > > #else > > PAGEFLAG_FALSE_xxx > > #endif > > ... > > #endif > > > > So when THP is disabled the PAGEFLAG_FALSE_xxx macro is actually absent. > > > > The upstream has the same issue, will send a patch to fix it soon, and > > send fixes (folded the new fix in) to -stable later. Sorry for the > > inconvenience. > > Further looking shows the upstream is good. I did *NOT* add the code > in CONFIG_TRANSHUGE_PAGE section. It seems the code section was moved > around when the patch was applied to 5.10. > > Could you please fold the below patch into > mm-filemap-check-if-thp-has-hwpoisoned-subpage-for-pmd-page-fault.patch? > Or I could prepare a patch for you. I need a working patch series please. thanks, greg k-h
diff --git a/mm/memory-failure.c b/mm/memory-failure.c index 5d880d4eb9a2..bf601283fcf3 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -956,20 +956,6 @@ static int get_hwpoison_page(struct page *page) { struct page *head = compound_head(page); - if (!PageHuge(head) && PageTransHuge(head)) { - /* - * Non anonymous thp exists only in allocation/free time. We - * can't handle such a case correctly, so let's give it up. - * This should be better than triggering BUG_ON when kernel - * tries to touch the "partially handled" page. - */ - if (!PageAnon(head)) { - pr_err("Memory failure: %#lx: non anonymous thp\n", - page_to_pfn(page)); - return 0; - } - } - if (get_page_unless_zero(head)) { if (head == compound_head(page)) return 1;