Message ID | 20220228140245.24552-5-linmiaohe@huawei.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | A few fixup patches for memory failure | expand |
On Mon, Feb 28, 2022 at 10:02:45PM +0800, Miaohe Lin wrote: > The huge zero page could reach here and if we ever try to split it, the > VM_BUG_ON_PAGE will be triggered in split_huge_page_to_list(). Also the > non-lru compound movable pages could be taken for transhuge pages. Skip > these pages by checking PageLRU because huge zero page isn't lru page as > non-lru compound movable pages. It seems that memory_failure() also fails at get_any_page() with "hwpoison: unhandlable page" message. [16478.203474] page:00000000b6acdbd1 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1810b4 [16478.206612] flags: 0x57ffffc0801000(reserved|hwpoison|node=1|zone=2|lastcpupid=0x1fffff) [16478.209411] raw: 0057ffffc0801000 fffff11bc6042d08 fffff11bc6042d08 0000000000000000 [16478.211921] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 [16478.214473] page dumped because: hwpoison: unhandlable page [16478.216386] Memory failure: 0x1810b4: recovery action for unknown page: Ignored We can't handle errors on huge (or normal) zero page, so the current behavior seems to me more suitable than "unsplit thp". Or if you have some producer to reach the following path with huge zero page, could you share it? Thanks, Naoya Horiguchi > > Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> > --- > mm/memory-failure.c | 14 ++++++++++++++ > 1 file changed, 14 insertions(+) > > diff --git a/mm/memory-failure.c b/mm/memory-failure.c > index 23bfd809dc8c..ac6492e36978 100644 > --- a/mm/memory-failure.c > +++ b/mm/memory-failure.c > @@ -1792,6 +1792,20 @@ int memory_failure(unsigned long pfn, int flags) > } > > if (PageTransHuge(hpage)) { > + /* > + * The non-lru compound movable pages could be taken for > + * transhuge pages. Also huge zero page could reach here > + * and if we ever try to split it, the VM_BUG_ON_PAGE will > + * be triggered in split_huge_page_to_list(). Skip these > + * pages by checking PageLRU because huge zero page isn't > + * lru page as non-lru compound movable pages. > + */ > + if (!PageLRU(hpage)) { > + put_page(p); > + action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED); > + res = -EBUSY; > + goto unlock_mutex; > + } > /* > * The flag must be set after the refcount is bumped > * otherwise it may race with THP split. > -- > 2.23.0
On 2022/3/4 16:28, HORIGUCHI NAOYA(堀口 直也) wrote: > On Mon, Feb 28, 2022 at 10:02:45PM +0800, Miaohe Lin wrote: >> The huge zero page could reach here and if we ever try to split it, the >> VM_BUG_ON_PAGE will be triggered in split_huge_page_to_list(). Also the >> non-lru compound movable pages could be taken for transhuge pages. Skip >> these pages by checking PageLRU because huge zero page isn't lru page as >> non-lru compound movable pages. > > It seems that memory_failure() also fails at get_any_page() with "hwpoison: > unhandlable page" message. > > [16478.203474] page:00000000b6acdbd1 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1810b4 > [16478.206612] flags: 0x57ffffc0801000(reserved|hwpoison|node=1|zone=2|lastcpupid=0x1fffff) > [16478.209411] raw: 0057ffffc0801000 fffff11bc6042d08 fffff11bc6042d08 0000000000000000 > [16478.211921] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 > [16478.214473] page dumped because: hwpoison: unhandlable page > [16478.216386] Memory failure: 0x1810b4: recovery action for unknown page: Ignored > > We can't handle errors on huge (or normal) zero page, so the current Sorry for confusing commit log again. I should have a coffee before I make this patch. Huge or normal zero page will fail at get_any_page because they're neither HWPoisonHandlable nor PageHuge. > behavior seems to me more suitable than "unsplit thp". > > Or if you have some producer to reach the following path with huge zero > page, could you share it? > What I mean is that non-lru movable compound page can reach here unexpected because __PageMovable(page) is handleable now. So get_any_page could succeed to grab the page refcnt. And since it's compound page, it will go through the split_huge_page_to_list because PageTransHuge checks PageHead(page) which can also be true for compound page. But this type of pages is unexpected for split_huge_page_to_list. Does this make sense for you? Thanks Naoya. > Thanks, > Naoya Horiguchi > >> >> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >> --- >> mm/memory-failure.c | 14 ++++++++++++++ >> 1 file changed, 14 insertions(+) >> >> diff --git a/mm/memory-failure.c b/mm/memory-failure.c >> index 23bfd809dc8c..ac6492e36978 100644 >> --- a/mm/memory-failure.c >> +++ b/mm/memory-failure.c >> @@ -1792,6 +1792,20 @@ int memory_failure(unsigned long pfn, int flags) >> } >> >> if (PageTransHuge(hpage)) { >> + /* >> + * The non-lru compound movable pages could be taken for >> + * transhuge pages. Also huge zero page could reach here >> + * and if we ever try to split it, the VM_BUG_ON_PAGE will >> + * be triggered in split_huge_page_to_list(). Skip these >> + * pages by checking PageLRU because huge zero page isn't >> + * lru page as non-lru compound movable pages. >> + */ >> + if (!PageLRU(hpage)) { >> + put_page(p); >> + action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED); >> + res = -EBUSY; >> + goto unlock_mutex; >> + } >> /* >> * The flag must be set after the refcount is bumped >> * otherwise it may race with THP split. >> -- >> 2.23.0
On Sun, Mar 6, 2022 at 11:07 PM Miaohe Lin <linmiaohe@huawei.com> wrote: > > On 2022/3/4 16:28, HORIGUCHI NAOYA(堀口 直也) wrote: > > On Mon, Feb 28, 2022 at 10:02:45PM +0800, Miaohe Lin wrote: > >> The huge zero page could reach here and if we ever try to split it, the > >> VM_BUG_ON_PAGE will be triggered in split_huge_page_to_list(). Also the > >> non-lru compound movable pages could be taken for transhuge pages. Skip > >> these pages by checking PageLRU because huge zero page isn't lru page as > >> non-lru compound movable pages. > > > > It seems that memory_failure() also fails at get_any_page() with "hwpoison: > > unhandlable page" message. > > > > [16478.203474] page:00000000b6acdbd1 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1810b4 > > [16478.206612] flags: 0x57ffffc0801000(reserved|hwpoison|node=1|zone=2|lastcpupid=0x1fffff) > > [16478.209411] raw: 0057ffffc0801000 fffff11bc6042d08 fffff11bc6042d08 0000000000000000 > > [16478.211921] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 > > [16478.214473] page dumped because: hwpoison: unhandlable page > > [16478.216386] Memory failure: 0x1810b4: recovery action for unknown page: Ignored > > > > We can't handle errors on huge (or normal) zero page, so the current > > Sorry for confusing commit log again. I should have a coffee before I make this patch. > Huge or normal zero page will fail at get_any_page because they're neither HWPoisonHandlable > nor PageHuge. > > > behavior seems to me more suitable than "unsplit thp". > > > > Or if you have some producer to reach the following path with huge zero > > page, could you share it? > > > > What I mean is that non-lru movable compound page can reach here unexpected because __PageMovable(page) > is handleable now. So get_any_page could succeed to grab the page refcnt. And since it's compound page, > it will go through the split_huge_page_to_list because PageTransHuge checks PageHead(page) which can also > be true for compound page. But this type of pages is unexpected for split_huge_page_to_list. Can we really handle non-LRU movable pages in memory failure (uncorrectable errors)? Typically they are balloon, zsmalloc, etc. Assuming we run into a base (4K) non-LRU movable page, we could reach as far as identify_page_state(), it should not fall into any category except me_unknown. So it seems we could just simply make it unhandlable. But it should be handlable for soft-offline since it could be migrated. > Does this make sense for you? Thanks Naoya. > > > Thanks, > > Naoya Horiguchi > > > >> > >> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> > >> --- > >> mm/memory-failure.c | 14 ++++++++++++++ > >> 1 file changed, 14 insertions(+) > >> > >> diff --git a/mm/memory-failure.c b/mm/memory-failure.c > >> index 23bfd809dc8c..ac6492e36978 100644 > >> --- a/mm/memory-failure.c > >> +++ b/mm/memory-failure.c > >> @@ -1792,6 +1792,20 @@ int memory_failure(unsigned long pfn, int flags) > >> } > >> > >> if (PageTransHuge(hpage)) { > >> + /* > >> + * The non-lru compound movable pages could be taken for > >> + * transhuge pages. Also huge zero page could reach here > >> + * and if we ever try to split it, the VM_BUG_ON_PAGE will > >> + * be triggered in split_huge_page_to_list(). Skip these > >> + * pages by checking PageLRU because huge zero page isn't > >> + * lru page as non-lru compound movable pages. > >> + */ > >> + if (!PageLRU(hpage)) { > >> + put_page(p); > >> + action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED); > >> + res = -EBUSY; > >> + goto unlock_mutex; > >> + } > >> /* > >> * The flag must be set after the refcount is bumped > >> * otherwise it may race with THP split. > >> -- > >> 2.23.0 > >
On 2022/3/8 3:53, Yang Shi wrote: > On Sun, Mar 6, 2022 at 11:07 PM Miaohe Lin <linmiaohe@huawei.com> wrote: >> >> On 2022/3/4 16:28, HORIGUCHI NAOYA(堀口 直也) wrote: >>> On Mon, Feb 28, 2022 at 10:02:45PM +0800, Miaohe Lin wrote: >>>> The huge zero page could reach here and if we ever try to split it, the >>>> VM_BUG_ON_PAGE will be triggered in split_huge_page_to_list(). Also the >>>> non-lru compound movable pages could be taken for transhuge pages. Skip >>>> these pages by checking PageLRU because huge zero page isn't lru page as >>>> non-lru compound movable pages. >>> >>> It seems that memory_failure() also fails at get_any_page() with "hwpoison: >>> unhandlable page" message. >>> >>> [16478.203474] page:00000000b6acdbd1 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1810b4 >>> [16478.206612] flags: 0x57ffffc0801000(reserved|hwpoison|node=1|zone=2|lastcpupid=0x1fffff) >>> [16478.209411] raw: 0057ffffc0801000 fffff11bc6042d08 fffff11bc6042d08 0000000000000000 >>> [16478.211921] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 >>> [16478.214473] page dumped because: hwpoison: unhandlable page >>> [16478.216386] Memory failure: 0x1810b4: recovery action for unknown page: Ignored >>> >>> We can't handle errors on huge (or normal) zero page, so the current >> >> Sorry for confusing commit log again. I should have a coffee before I make this patch. >> Huge or normal zero page will fail at get_any_page because they're neither HWPoisonHandlable >> nor PageHuge. >> >>> behavior seems to me more suitable than "unsplit thp". >>> >>> Or if you have some producer to reach the following path with huge zero >>> page, could you share it? >>> >> >> What I mean is that non-lru movable compound page can reach here unexpected because __PageMovable(page) >> is handleable now. So get_any_page could succeed to grab the page refcnt. And since it's compound page, >> it will go through the split_huge_page_to_list because PageTransHuge checks PageHead(page) which can also >> be true for compound page. But this type of pages is unexpected for split_huge_page_to_list. > > Can we really handle non-LRU movable pages in memory failure > (uncorrectable errors)? Typically they are balloon, zsmalloc, etc. > Assuming we run into a base (4K) non-LRU movable page, we could reach > as far as identify_page_state(), it should not fall into any category > except me_unknown. So it seems we could just simply make it > unhandlable. There is the comment from memory_failure: /* * We ignore non-LRU pages for good reasons. * - PG_locked is only well defined for LRU pages and a few others * - to avoid races with __SetPageLocked() * - to avoid races with __SetPageSlab*() (and more non-atomic ops) * The check (unnecessarily) ignores LRU pages being isolated and * walked by the page reclaim code, however that's not a big loss. */ So we could not handle non-LRU movable pages. What do you mean is something like below? diff --git a/mm/memory-failure.c b/mm/memory-failure.c index 5444a8ef4867..d80dbe0f20b6 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1784,6 +1784,13 @@ int memory_failure(unsigned long pfn, int flags) } } + if (__PageMovable(hpage)) { + put_page(p); + action_result(pfn, MF_MSG_MOVALBE_PAGE, MF_IGNORED); + res = -EBUSY; + goto unlock_mutex; + } + if (PageTransHuge(hpage)) { /* * The flag must be set after the refcount is bumped i.e. Simply make non-LRU movable pages unhandlable ? > > But it should be handlable for soft-offline since it could be migrated. > Yes, non-LRU movable pages can be simply migrated. Many thanks. > >> Does this make sense for you? Thanks Naoya. >> >>> Thanks, >>> Naoya Horiguchi >>> >>>> >>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >>>> --- >>>> mm/memory-failure.c | 14 ++++++++++++++ >>>> 1 file changed, 14 insertions(+) >>>> >>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c >>>> index 23bfd809dc8c..ac6492e36978 100644 >>>> --- a/mm/memory-failure.c >>>> +++ b/mm/memory-failure.c >>>> @@ -1792,6 +1792,20 @@ int memory_failure(unsigned long pfn, int flags) >>>> } >>>> >>>> if (PageTransHuge(hpage)) { >>>> + /* >>>> + * The non-lru compound movable pages could be taken for >>>> + * transhuge pages. Also huge zero page could reach here >>>> + * and if we ever try to split it, the VM_BUG_ON_PAGE will >>>> + * be triggered in split_huge_page_to_list(). Skip these >>>> + * pages by checking PageLRU because huge zero page isn't >>>> + * lru page as non-lru compound movable pages. >>>> + */ >>>> + if (!PageLRU(hpage)) { >>>> + put_page(p); >>>> + action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED); >>>> + res = -EBUSY; >>>> + goto unlock_mutex; >>>> + } >>>> /* >>>> * The flag must be set after the refcount is bumped >>>> * otherwise it may race with THP split. >>>> -- >>>> 2.23.0 >> >> > . >
On Tue, Mar 8, 2022 at 4:36 AM Miaohe Lin <linmiaohe@huawei.com> wrote: > > On 2022/3/8 3:53, Yang Shi wrote: > > On Sun, Mar 6, 2022 at 11:07 PM Miaohe Lin <linmiaohe@huawei.com> wrote: > >> > >> On 2022/3/4 16:28, HORIGUCHI NAOYA(堀口 直也) wrote: > >>> On Mon, Feb 28, 2022 at 10:02:45PM +0800, Miaohe Lin wrote: > >>>> The huge zero page could reach here and if we ever try to split it, the > >>>> VM_BUG_ON_PAGE will be triggered in split_huge_page_to_list(). Also the > >>>> non-lru compound movable pages could be taken for transhuge pages. Skip > >>>> these pages by checking PageLRU because huge zero page isn't lru page as > >>>> non-lru compound movable pages. > >>> > >>> It seems that memory_failure() also fails at get_any_page() with "hwpoison: > >>> unhandlable page" message. > >>> > >>> [16478.203474] page:00000000b6acdbd1 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1810b4 > >>> [16478.206612] flags: 0x57ffffc0801000(reserved|hwpoison|node=1|zone=2|lastcpupid=0x1fffff) > >>> [16478.209411] raw: 0057ffffc0801000 fffff11bc6042d08 fffff11bc6042d08 0000000000000000 > >>> [16478.211921] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 > >>> [16478.214473] page dumped because: hwpoison: unhandlable page > >>> [16478.216386] Memory failure: 0x1810b4: recovery action for unknown page: Ignored > >>> > >>> We can't handle errors on huge (or normal) zero page, so the current > >> > >> Sorry for confusing commit log again. I should have a coffee before I make this patch. > >> Huge or normal zero page will fail at get_any_page because they're neither HWPoisonHandlable > >> nor PageHuge. > >> > >>> behavior seems to me more suitable than "unsplit thp". > >>> > >>> Or if you have some producer to reach the following path with huge zero > >>> page, could you share it? > >>> > >> > >> What I mean is that non-lru movable compound page can reach here unexpected because __PageMovable(page) > >> is handleable now. So get_any_page could succeed to grab the page refcnt. And since it's compound page, > >> it will go through the split_huge_page_to_list because PageTransHuge checks PageHead(page) which can also > >> be true for compound page. But this type of pages is unexpected for split_huge_page_to_list. > > > > Can we really handle non-LRU movable pages in memory failure > > (uncorrectable errors)? Typically they are balloon, zsmalloc, etc. > > Assuming we run into a base (4K) non-LRU movable page, we could reach > > as far as identify_page_state(), it should not fall into any category > > except me_unknown. So it seems we could just simply make it > > unhandlable. > > There is the comment from memory_failure: > /* > * We ignore non-LRU pages for good reasons. > * - PG_locked is only well defined for LRU pages and a few others > * - to avoid races with __SetPageLocked() > * - to avoid races with __SetPageSlab*() (and more non-atomic ops) > * The check (unnecessarily) ignores LRU pages being isolated and > * walked by the page reclaim code, however that's not a big loss. > */ > > So we could not handle non-LRU movable pages. > > What do you mean is something like below? > > diff --git a/mm/memory-failure.c b/mm/memory-failure.c > index 5444a8ef4867..d80dbe0f20b6 100644 > --- a/mm/memory-failure.c > +++ b/mm/memory-failure.c > @@ -1784,6 +1784,13 @@ int memory_failure(unsigned long pfn, int flags) > } > } > > + if (__PageMovable(hpage)) { > + put_page(p); > + action_result(pfn, MF_MSG_MOVALBE_PAGE, MF_IGNORED); > + res = -EBUSY; > + goto unlock_mutex; > + } > + > if (PageTransHuge(hpage)) { > /* > * The flag must be set after the refcount is bumped > > > i.e. Simply make non-LRU movable pages unhandlable ? I'd prefer this personally. Something like the below (compile test only): diff --git a/mm/memory-failure.c b/mm/memory-failure.c index 5444a8ef4867..789e40909ade 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1176,12 +1176,18 @@ void ClearPageHWPoisonTakenOff(struct page *page) * does not return true for hugetlb or device memory pages, so it's assumed * to be called only in the context where we never have such pages. */ -static inline bool HWPoisonHandlable(struct page *page) +static inline bool HWPoisonHandlable(struct page *page, unsigned long flags) { - return PageLRU(page) || __PageMovable(page) || is_free_buddy_page(page); + bool movable = false; + + /* Soft offline could mirgate non-LRU movable pages */ + if ((flags & MF_SOFT_OFFLINE) && __PageMovable(page)) + movable = true; + + return movable || PageLRU(page) || is_free_buddy_page(page); } -static int __get_hwpoison_page(struct page *page) +static int __get_hwpoison_page(struct page *page, unsigned long flags) { struct page *head = compound_head(page); int ret = 0; @@ -1196,7 +1202,7 @@ static int __get_hwpoison_page(struct page *page) * for any unsupported type of page in order to reduce the risk of * unexpected races caused by taking a page refcount. */ - if (!HWPoisonHandlable(head)) + if (!HWPoisonHandlable(head, flags)) return -EBUSY; if (get_page_unless_zero(head)) { @@ -1221,7 +1227,7 @@ static int get_any_page(struct page *p, unsigned long flags) try_again: if (!count_increased) { - ret = __get_hwpoison_page(p); + ret = __get_hwpoison_page(p, flags); if (!ret) { if (page_count(p)) { /* We raced with an allocation, retry. */ @@ -1249,7 +1255,7 @@ static int get_any_page(struct page *p, unsigned long flags) } } - if (PageHuge(p) || HWPoisonHandlable(p)) { + if (PageHuge(p) || HWPoisonHandlable(p, flags)) { ret = 1; } else { /* > > > > > But it should be handlable for soft-offline since it could be migrated. > > > > Yes, non-LRU movable pages can be simply migrated. > > Many thanks. > > > > >> Does this make sense for you? Thanks Naoya. > >> > >>> Thanks, > >>> Naoya Horiguchi > >>> > >>>> > >>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> > >>>> --- > >>>> mm/memory-failure.c | 14 ++++++++++++++ > >>>> 1 file changed, 14 insertions(+) > >>>> > >>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c > >>>> index 23bfd809dc8c..ac6492e36978 100644 > >>>> --- a/mm/memory-failure.c > >>>> +++ b/mm/memory-failure.c > >>>> @@ -1792,6 +1792,20 @@ int memory_failure(unsigned long pfn, int flags) > >>>> } > >>>> > >>>> if (PageTransHuge(hpage)) { > >>>> + /* > >>>> + * The non-lru compound movable pages could be taken for > >>>> + * transhuge pages. Also huge zero page could reach here > >>>> + * and if we ever try to split it, the VM_BUG_ON_PAGE will > >>>> + * be triggered in split_huge_page_to_list(). Skip these > >>>> + * pages by checking PageLRU because huge zero page isn't > >>>> + * lru page as non-lru compound movable pages. > >>>> + */ > >>>> + if (!PageLRU(hpage)) { > >>>> + put_page(p); > >>>> + action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED); > >>>> + res = -EBUSY; > >>>> + goto unlock_mutex; > >>>> + } > >>>> /* > >>>> * The flag must be set after the refcount is bumped > >>>> * otherwise it may race with THP split. > >>>> -- > >>>> 2.23.0 > >> > >> > > . > > >
On 2022/3/9 2:47, Yang Shi wrote: > On Tue, Mar 8, 2022 at 4:36 AM Miaohe Lin <linmiaohe@huawei.com> wrote: >> >> On 2022/3/8 3:53, Yang Shi wrote: >>> On Sun, Mar 6, 2022 at 11:07 PM Miaohe Lin <linmiaohe@huawei.com> wrote: >>>> >>>> On 2022/3/4 16:28, HORIGUCHI NAOYA(堀口 直也) wrote: >>>>> On Mon, Feb 28, 2022 at 10:02:45PM +0800, Miaohe Lin wrote: >>>>>> The huge zero page could reach here and if we ever try to split it, the >>>>>> VM_BUG_ON_PAGE will be triggered in split_huge_page_to_list(). Also the >>>>>> non-lru compound movable pages could be taken for transhuge pages. Skip >>>>>> these pages by checking PageLRU because huge zero page isn't lru page as >>>>>> non-lru compound movable pages. >>>>> >>>>> It seems that memory_failure() also fails at get_any_page() with "hwpoison: >>>>> unhandlable page" message. >>>>> >>>>> [16478.203474] page:00000000b6acdbd1 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1810b4 >>>>> [16478.206612] flags: 0x57ffffc0801000(reserved|hwpoison|node=1|zone=2|lastcpupid=0x1fffff) >>>>> [16478.209411] raw: 0057ffffc0801000 fffff11bc6042d08 fffff11bc6042d08 0000000000000000 >>>>> [16478.211921] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 >>>>> [16478.214473] page dumped because: hwpoison: unhandlable page >>>>> [16478.216386] Memory failure: 0x1810b4: recovery action for unknown page: Ignored >>>>> >>>>> We can't handle errors on huge (or normal) zero page, so the current >>>> >>>> Sorry for confusing commit log again. I should have a coffee before I make this patch. >>>> Huge or normal zero page will fail at get_any_page because they're neither HWPoisonHandlable >>>> nor PageHuge. >>>> >>>>> behavior seems to me more suitable than "unsplit thp". >>>>> >>>>> Or if you have some producer to reach the following path with huge zero >>>>> page, could you share it? >>>>> >>>> >>>> What I mean is that non-lru movable compound page can reach here unexpected because __PageMovable(page) >>>> is handleable now. So get_any_page could succeed to grab the page refcnt. And since it's compound page, >>>> it will go through the split_huge_page_to_list because PageTransHuge checks PageHead(page) which can also >>>> be true for compound page. But this type of pages is unexpected for split_huge_page_to_list. >>> >>> Can we really handle non-LRU movable pages in memory failure >>> (uncorrectable errors)? Typically they are balloon, zsmalloc, etc. >>> Assuming we run into a base (4K) non-LRU movable page, we could reach >>> as far as identify_page_state(), it should not fall into any category >>> except me_unknown. So it seems we could just simply make it >>> unhandlable. >> >> There is the comment from memory_failure: >> /* >> * We ignore non-LRU pages for good reasons. >> * - PG_locked is only well defined for LRU pages and a few others >> * - to avoid races with __SetPageLocked() >> * - to avoid races with __SetPageSlab*() (and more non-atomic ops) >> * The check (unnecessarily) ignores LRU pages being isolated and >> * walked by the page reclaim code, however that's not a big loss. >> */ >> >> So we could not handle non-LRU movable pages. >> >> What do you mean is something like below? >> >> diff --git a/mm/memory-failure.c b/mm/memory-failure.c >> index 5444a8ef4867..d80dbe0f20b6 100644 >> --- a/mm/memory-failure.c >> +++ b/mm/memory-failure.c >> @@ -1784,6 +1784,13 @@ int memory_failure(unsigned long pfn, int flags) >> } >> } >> >> + if (__PageMovable(hpage)) { >> + put_page(p); >> + action_result(pfn, MF_MSG_MOVALBE_PAGE, MF_IGNORED); >> + res = -EBUSY; >> + goto unlock_mutex; >> + } >> + >> if (PageTransHuge(hpage)) { >> /* >> * The flag must be set after the refcount is bumped >> >> >> i.e. Simply make non-LRU movable pages unhandlable ? > The below change looks good to me except we need to ensure the caller passing in the MF_SOFT_OFFLINE flag while doing soft-offline now. This is false for madvise_inject_error but it's trivial to change this. Will try to do this in V2. Many thanks. > I'd prefer this personally. Something like the below (compile test only): > > diff --git a/mm/memory-failure.c b/mm/memory-failure.c > index 5444a8ef4867..789e40909ade 100644 > --- a/mm/memory-failure.c > +++ b/mm/memory-failure.c > @@ -1176,12 +1176,18 @@ void ClearPageHWPoisonTakenOff(struct page *page) > * does not return true for hugetlb or device memory pages, so it's assumed > * to be called only in the context where we never have such pages. > */ > -static inline bool HWPoisonHandlable(struct page *page) > +static inline bool HWPoisonHandlable(struct page *page, unsigned long flags) > { > - return PageLRU(page) || __PageMovable(page) || is_free_buddy_page(page); > + bool movable = false; > + > + /* Soft offline could mirgate non-LRU movable pages */ > + if ((flags & MF_SOFT_OFFLINE) && __PageMovable(page)) > + movable = true; > + > + return movable || PageLRU(page) || is_free_buddy_page(page); > } > > -static int __get_hwpoison_page(struct page *page) > +static int __get_hwpoison_page(struct page *page, unsigned long flags) > { > struct page *head = compound_head(page); > int ret = 0; > @@ -1196,7 +1202,7 @@ static int __get_hwpoison_page(struct page *page) > * for any unsupported type of page in order to reduce the risk of > * unexpected races caused by taking a page refcount. > */ > - if (!HWPoisonHandlable(head)) > + if (!HWPoisonHandlable(head, flags)) > return -EBUSY; > > if (get_page_unless_zero(head)) { > @@ -1221,7 +1227,7 @@ static int get_any_page(struct page *p, unsigned > long flags) > > try_again: > if (!count_increased) { > - ret = __get_hwpoison_page(p); > + ret = __get_hwpoison_page(p, flags); > if (!ret) { > if (page_count(p)) { > /* We raced with an allocation, retry. */ > @@ -1249,7 +1255,7 @@ static int get_any_page(struct page *p, unsigned > long flags) > } > } > > - if (PageHuge(p) || HWPoisonHandlable(p)) { > + if (PageHuge(p) || HWPoisonHandlable(p, flags)) { > ret = 1; > } else { > /* > >> >>> >>> But it should be handlable for soft-offline since it could be migrated. >>> >> >> Yes, non-LRU movable pages can be simply migrated. >> >> Many thanks. >> >>> >>>> Does this make sense for you? Thanks Naoya. >>>> >>>>> Thanks, >>>>> Naoya Horiguchi >>>>> >>>>>> >>>>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >>>>>> --- >>>>>> mm/memory-failure.c | 14 ++++++++++++++ >>>>>> 1 file changed, 14 insertions(+) >>>>>> >>>>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c >>>>>> index 23bfd809dc8c..ac6492e36978 100644 >>>>>> --- a/mm/memory-failure.c >>>>>> +++ b/mm/memory-failure.c >>>>>> @@ -1792,6 +1792,20 @@ int memory_failure(unsigned long pfn, int flags) >>>>>> } >>>>>> >>>>>> if (PageTransHuge(hpage)) { >>>>>> + /* >>>>>> + * The non-lru compound movable pages could be taken for >>>>>> + * transhuge pages. Also huge zero page could reach here >>>>>> + * and if we ever try to split it, the VM_BUG_ON_PAGE will >>>>>> + * be triggered in split_huge_page_to_list(). Skip these >>>>>> + * pages by checking PageLRU because huge zero page isn't >>>>>> + * lru page as non-lru compound movable pages. >>>>>> + */ >>>>>> + if (!PageLRU(hpage)) { >>>>>> + put_page(p); >>>>>> + action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED); >>>>>> + res = -EBUSY; >>>>>> + goto unlock_mutex; >>>>>> + } >>>>>> /* >>>>>> * The flag must be set after the refcount is bumped >>>>>> * otherwise it may race with THP split. >>>>>> -- >>>>>> 2.23.0 >>>> >>>> >>> . >>> >> > . >
On 2022/3/9 2:47, Yang Shi wrote: > On Tue, Mar 8, 2022 at 4:36 AM Miaohe Lin <linmiaohe@huawei.com> wrote: >> >> On 2022/3/8 3:53, Yang Shi wrote: >>> On Sun, Mar 6, 2022 at 11:07 PM Miaohe Lin <linmiaohe@huawei.com> wrote: >>>> >>>> On 2022/3/4 16:28, HORIGUCHI NAOYA(堀口 直也) wrote: >>>>> On Mon, Feb 28, 2022 at 10:02:45PM +0800, Miaohe Lin wrote: >>>>>> The huge zero page could reach here and if we ever try to split it, the >>>>>> VM_BUG_ON_PAGE will be triggered in split_huge_page_to_list(). Also the >>>>>> non-lru compound movable pages could be taken for transhuge pages. Skip >>>>>> these pages by checking PageLRU because huge zero page isn't lru page as >>>>>> non-lru compound movable pages. >>>>> >>>>> It seems that memory_failure() also fails at get_any_page() with "hwpoison: >>>>> unhandlable page" message. >>>>> >>>>> [16478.203474] page:00000000b6acdbd1 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1810b4 >>>>> [16478.206612] flags: 0x57ffffc0801000(reserved|hwpoison|node=1|zone=2|lastcpupid=0x1fffff) >>>>> [16478.209411] raw: 0057ffffc0801000 fffff11bc6042d08 fffff11bc6042d08 0000000000000000 >>>>> [16478.211921] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 >>>>> [16478.214473] page dumped because: hwpoison: unhandlable page >>>>> [16478.216386] Memory failure: 0x1810b4: recovery action for unknown page: Ignored >>>>> >>>>> We can't handle errors on huge (or normal) zero page, so the current >>>> >>>> Sorry for confusing commit log again. I should have a coffee before I make this patch. >>>> Huge or normal zero page will fail at get_any_page because they're neither HWPoisonHandlable >>>> nor PageHuge. >>>> >>>>> behavior seems to me more suitable than "unsplit thp". >>>>> >>>>> Or if you have some producer to reach the following path with huge zero >>>>> page, could you share it? >>>>> >>>> >>>> What I mean is that non-lru movable compound page can reach here unexpected because __PageMovable(page) >>>> is handleable now. So get_any_page could succeed to grab the page refcnt. And since it's compound page, >>>> it will go through the split_huge_page_to_list because PageTransHuge checks PageHead(page) which can also >>>> be true for compound page. But this type of pages is unexpected for split_huge_page_to_list. >>> >>> Can we really handle non-LRU movable pages in memory failure >>> (uncorrectable errors)? Typically they are balloon, zsmalloc, etc. >>> Assuming we run into a base (4K) non-LRU movable page, we could reach >>> as far as identify_page_state(), it should not fall into any category >>> except me_unknown. So it seems we could just simply make it >>> unhandlable. >> >> There is the comment from memory_failure: >> /* >> * We ignore non-LRU pages for good reasons. >> * - PG_locked is only well defined for LRU pages and a few others >> * - to avoid races with __SetPageLocked() >> * - to avoid races with __SetPageSlab*() (and more non-atomic ops) >> * The check (unnecessarily) ignores LRU pages being isolated and >> * walked by the page reclaim code, however that's not a big loss. >> */ >> >> So we could not handle non-LRU movable pages. >> >> What do you mean is something like below? >> >> diff --git a/mm/memory-failure.c b/mm/memory-failure.c >> index 5444a8ef4867..d80dbe0f20b6 100644 >> --- a/mm/memory-failure.c >> +++ b/mm/memory-failure.c >> @@ -1784,6 +1784,13 @@ int memory_failure(unsigned long pfn, int flags) >> } >> } >> >> + if (__PageMovable(hpage)) { >> + put_page(p); >> + action_result(pfn, MF_MSG_MOVALBE_PAGE, MF_IGNORED); >> + res = -EBUSY; >> + goto unlock_mutex; >> + } >> + >> if (PageTransHuge(hpage)) { >> /* >> * The flag must be set after the refcount is bumped >> >> >> i.e. Simply make non-LRU movable pages unhandlable ? > I think about the below code more carefully and I found that this will make hwpoison_filter can't handle the non-LRU movable pages now. Because non-LRU movable pages return early now and thus can't reach the hwpoison_filter. This results in a inconsistent behavior with previous one. So I think the origin fixup of this patch is more suitable. What do you think? Thanks. > I'd prefer this personally. Something like the below (compile test only): > > diff --git a/mm/memory-failure.c b/mm/memory-failure.c > index 5444a8ef4867..789e40909ade 100644 > --- a/mm/memory-failure.c > +++ b/mm/memory-failure.c > @@ -1176,12 +1176,18 @@ void ClearPageHWPoisonTakenOff(struct page *page) > * does not return true for hugetlb or device memory pages, so it's assumed > * to be called only in the context where we never have such pages. > */ > -static inline bool HWPoisonHandlable(struct page *page) > +static inline bool HWPoisonHandlable(struct page *page, unsigned long flags) > { > - return PageLRU(page) || __PageMovable(page) || is_free_buddy_page(page); > + bool movable = false; > + > + /* Soft offline could mirgate non-LRU movable pages */ > + if ((flags & MF_SOFT_OFFLINE) && __PageMovable(page)) > + movable = true; > + > + return movable || PageLRU(page) || is_free_buddy_page(page); > } > > -static int __get_hwpoison_page(struct page *page) > +static int __get_hwpoison_page(struct page *page, unsigned long flags) > { > struct page *head = compound_head(page); > int ret = 0; > @@ -1196,7 +1202,7 @@ static int __get_hwpoison_page(struct page *page) > * for any unsupported type of page in order to reduce the risk of > * unexpected races caused by taking a page refcount. > */ > - if (!HWPoisonHandlable(head)) > + if (!HWPoisonHandlable(head, flags)) > return -EBUSY; > > if (get_page_unless_zero(head)) { > @@ -1221,7 +1227,7 @@ static int get_any_page(struct page *p, unsigned > long flags) > > try_again: > if (!count_increased) { > - ret = __get_hwpoison_page(p); > + ret = __get_hwpoison_page(p, flags); > if (!ret) { > if (page_count(p)) { > /* We raced with an allocation, retry. */ > @@ -1249,7 +1255,7 @@ static int get_any_page(struct page *p, unsigned > long flags) > } > } > > - if (PageHuge(p) || HWPoisonHandlable(p)) { > + if (PageHuge(p) || HWPoisonHandlable(p, flags)) { > ret = 1; > } else { > /* > >> >>> >>> But it should be handlable for soft-offline since it could be migrated. >>> >> >> Yes, non-LRU movable pages can be simply migrated. >> >> Many thanks. >> >>> >>>> Does this make sense for you? Thanks Naoya. >>>> >>>>> Thanks, >>>>> Naoya Horiguchi >>>>> >>>>>> >>>>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >>>>>> --- >>>>>> mm/memory-failure.c | 14 ++++++++++++++ >>>>>> 1 file changed, 14 insertions(+) >>>>>> >>>>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c >>>>>> index 23bfd809dc8c..ac6492e36978 100644 >>>>>> --- a/mm/memory-failure.c >>>>>> +++ b/mm/memory-failure.c >>>>>> @@ -1792,6 +1792,20 @@ int memory_failure(unsigned long pfn, int flags) >>>>>> } >>>>>> >>>>>> if (PageTransHuge(hpage)) { >>>>>> + /* >>>>>> + * The non-lru compound movable pages could be taken for >>>>>> + * transhuge pages. Also huge zero page could reach here >>>>>> + * and if we ever try to split it, the VM_BUG_ON_PAGE will >>>>>> + * be triggered in split_huge_page_to_list(). Skip these >>>>>> + * pages by checking PageLRU because huge zero page isn't >>>>>> + * lru page as non-lru compound movable pages. >>>>>> + */ >>>>>> + if (!PageLRU(hpage)) { >>>>>> + put_page(p); >>>>>> + action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED); >>>>>> + res = -EBUSY; >>>>>> + goto unlock_mutex; >>>>>> + } >>>>>> /* >>>>>> * The flag must be set after the refcount is bumped >>>>>> * otherwise it may race with THP split. >>>>>> -- >>>>>> 2.23.0 >>>> >>>> >>> . >>> >> > . >
On Thu, Mar 10, 2022 at 3:46 AM Miaohe Lin <linmiaohe@huawei.com> wrote: > > On 2022/3/9 2:47, Yang Shi wrote: > > On Tue, Mar 8, 2022 at 4:36 AM Miaohe Lin <linmiaohe@huawei.com> wrote: > >> > >> On 2022/3/8 3:53, Yang Shi wrote: > >>> On Sun, Mar 6, 2022 at 11:07 PM Miaohe Lin <linmiaohe@huawei.com> wrote: > >>>> > >>>> On 2022/3/4 16:28, HORIGUCHI NAOYA(堀口 直也) wrote: > >>>>> On Mon, Feb 28, 2022 at 10:02:45PM +0800, Miaohe Lin wrote: > >>>>>> The huge zero page could reach here and if we ever try to split it, the > >>>>>> VM_BUG_ON_PAGE will be triggered in split_huge_page_to_list(). Also the > >>>>>> non-lru compound movable pages could be taken for transhuge pages. Skip > >>>>>> these pages by checking PageLRU because huge zero page isn't lru page as > >>>>>> non-lru compound movable pages. > >>>>> > >>>>> It seems that memory_failure() also fails at get_any_page() with "hwpoison: > >>>>> unhandlable page" message. > >>>>> > >>>>> [16478.203474] page:00000000b6acdbd1 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1810b4 > >>>>> [16478.206612] flags: 0x57ffffc0801000(reserved|hwpoison|node=1|zone=2|lastcpupid=0x1fffff) > >>>>> [16478.209411] raw: 0057ffffc0801000 fffff11bc6042d08 fffff11bc6042d08 0000000000000000 > >>>>> [16478.211921] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 > >>>>> [16478.214473] page dumped because: hwpoison: unhandlable page > >>>>> [16478.216386] Memory failure: 0x1810b4: recovery action for unknown page: Ignored > >>>>> > >>>>> We can't handle errors on huge (or normal) zero page, so the current > >>>> > >>>> Sorry for confusing commit log again. I should have a coffee before I make this patch. > >>>> Huge or normal zero page will fail at get_any_page because they're neither HWPoisonHandlable > >>>> nor PageHuge. > >>>> > >>>>> behavior seems to me more suitable than "unsplit thp". > >>>>> > >>>>> Or if you have some producer to reach the following path with huge zero > >>>>> page, could you share it? > >>>>> > >>>> > >>>> What I mean is that non-lru movable compound page can reach here unexpected because __PageMovable(page) > >>>> is handleable now. So get_any_page could succeed to grab the page refcnt. And since it's compound page, > >>>> it will go through the split_huge_page_to_list because PageTransHuge checks PageHead(page) which can also > >>>> be true for compound page. But this type of pages is unexpected for split_huge_page_to_list. > >>> > >>> Can we really handle non-LRU movable pages in memory failure > >>> (uncorrectable errors)? Typically they are balloon, zsmalloc, etc. > >>> Assuming we run into a base (4K) non-LRU movable page, we could reach > >>> as far as identify_page_state(), it should not fall into any category > >>> except me_unknown. So it seems we could just simply make it > >>> unhandlable. > >> > >> There is the comment from memory_failure: > >> /* > >> * We ignore non-LRU pages for good reasons. > >> * - PG_locked is only well defined for LRU pages and a few others > >> * - to avoid races with __SetPageLocked() > >> * - to avoid races with __SetPageSlab*() (and more non-atomic ops) > >> * The check (unnecessarily) ignores LRU pages being isolated and > >> * walked by the page reclaim code, however that's not a big loss. > >> */ > >> > >> So we could not handle non-LRU movable pages. > >> > >> What do you mean is something like below? > >> > >> diff --git a/mm/memory-failure.c b/mm/memory-failure.c > >> index 5444a8ef4867..d80dbe0f20b6 100644 > >> --- a/mm/memory-failure.c > >> +++ b/mm/memory-failure.c > >> @@ -1784,6 +1784,13 @@ int memory_failure(unsigned long pfn, int flags) > >> } > >> } > >> > >> + if (__PageMovable(hpage)) { > >> + put_page(p); > >> + action_result(pfn, MF_MSG_MOVALBE_PAGE, MF_IGNORED); > >> + res = -EBUSY; > >> + goto unlock_mutex; > >> + } > >> + > >> if (PageTransHuge(hpage)) { > >> /* > >> * The flag must be set after the refcount is bumped > >> > >> > >> i.e. Simply make non-LRU movable pages unhandlable ? > > > > I think about the below code more carefully and I found that this will make > hwpoison_filter can't handle the non-LRU movable pages now. Because non-LRU > movable pages return early now and thus can't reach the hwpoison_filter. This > results in a inconsistent behavior with previous one. So I think the origin > fixup of this patch is more suitable. What do you think? I'm not familiar with hwpoison_filter(), it seems like a test helper for error injection. I don't think hwpoison_filter() is used to filter unhandlable page, for example, slab page, IIUC. So the non-LRU movable pages should be treated the same. If so, the old behavior was simply wrong. > > Thanks. > > > I'd prefer this personally. Something like the below (compile test only): > > > > diff --git a/mm/memory-failure.c b/mm/memory-failure.c > > index 5444a8ef4867..789e40909ade 100644 > > --- a/mm/memory-failure.c > > +++ b/mm/memory-failure.c > > @@ -1176,12 +1176,18 @@ void ClearPageHWPoisonTakenOff(struct page *page) > > * does not return true for hugetlb or device memory pages, so it's assumed > > * to be called only in the context where we never have such pages. > > */ > > -static inline bool HWPoisonHandlable(struct page *page) > > +static inline bool HWPoisonHandlable(struct page *page, unsigned long flags) > > { > > - return PageLRU(page) || __PageMovable(page) || is_free_buddy_page(page); > > + bool movable = false; > > + > > + /* Soft offline could mirgate non-LRU movable pages */ > > + if ((flags & MF_SOFT_OFFLINE) && __PageMovable(page)) > > + movable = true; > > + > > + return movable || PageLRU(page) || is_free_buddy_page(page); > > } > > > > -static int __get_hwpoison_page(struct page *page) > > +static int __get_hwpoison_page(struct page *page, unsigned long flags) > > { > > struct page *head = compound_head(page); > > int ret = 0; > > @@ -1196,7 +1202,7 @@ static int __get_hwpoison_page(struct page *page) > > * for any unsupported type of page in order to reduce the risk of > > * unexpected races caused by taking a page refcount. > > */ > > - if (!HWPoisonHandlable(head)) > > + if (!HWPoisonHandlable(head, flags)) > > return -EBUSY; > > > > if (get_page_unless_zero(head)) { > > @@ -1221,7 +1227,7 @@ static int get_any_page(struct page *p, unsigned > > long flags) > > > > try_again: > > if (!count_increased) { > > - ret = __get_hwpoison_page(p); > > + ret = __get_hwpoison_page(p, flags); > > if (!ret) { > > if (page_count(p)) { > > /* We raced with an allocation, retry. */ > > @@ -1249,7 +1255,7 @@ static int get_any_page(struct page *p, unsigned > > long flags) > > } > > } > > > > - if (PageHuge(p) || HWPoisonHandlable(p)) { > > + if (PageHuge(p) || HWPoisonHandlable(p, flags)) { > > ret = 1; > > } else { > > /* > > > >> > >>> > >>> But it should be handlable for soft-offline since it could be migrated. > >>> > >> > >> Yes, non-LRU movable pages can be simply migrated. > >> > >> Many thanks. > >> > >>> > >>>> Does this make sense for you? Thanks Naoya. > >>>> > >>>>> Thanks, > >>>>> Naoya Horiguchi > >>>>> > >>>>>> > >>>>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> > >>>>>> --- > >>>>>> mm/memory-failure.c | 14 ++++++++++++++ > >>>>>> 1 file changed, 14 insertions(+) > >>>>>> > >>>>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c > >>>>>> index 23bfd809dc8c..ac6492e36978 100644 > >>>>>> --- a/mm/memory-failure.c > >>>>>> +++ b/mm/memory-failure.c > >>>>>> @@ -1792,6 +1792,20 @@ int memory_failure(unsigned long pfn, int flags) > >>>>>> } > >>>>>> > >>>>>> if (PageTransHuge(hpage)) { > >>>>>> + /* > >>>>>> + * The non-lru compound movable pages could be taken for > >>>>>> + * transhuge pages. Also huge zero page could reach here > >>>>>> + * and if we ever try to split it, the VM_BUG_ON_PAGE will > >>>>>> + * be triggered in split_huge_page_to_list(). Skip these > >>>>>> + * pages by checking PageLRU because huge zero page isn't > >>>>>> + * lru page as non-lru compound movable pages. > >>>>>> + */ > >>>>>> + if (!PageLRU(hpage)) { > >>>>>> + put_page(p); > >>>>>> + action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED); > >>>>>> + res = -EBUSY; > >>>>>> + goto unlock_mutex; > >>>>>> + } > >>>>>> /* > >>>>>> * The flag must be set after the refcount is bumped > >>>>>> * otherwise it may race with THP split. > >>>>>> -- > >>>>>> 2.23.0 > >>>> > >>>> > >>> . > >>> > >> > > . > > > >
On 2022/3/11 3:32, Yang Shi wrote: > On Thu, Mar 10, 2022 at 3:46 AM Miaohe Lin <linmiaohe@huawei.com> wrote: >> >> On 2022/3/9 2:47, Yang Shi wrote: >>> On Tue, Mar 8, 2022 at 4:36 AM Miaohe Lin <linmiaohe@huawei.com> wrote: >>>> >>>> On 2022/3/8 3:53, Yang Shi wrote: >>>>> On Sun, Mar 6, 2022 at 11:07 PM Miaohe Lin <linmiaohe@huawei.com> wrote: >>>>>> >>>>>> On 2022/3/4 16:28, HORIGUCHI NAOYA(堀口 直也) wrote: >>>>>>> On Mon, Feb 28, 2022 at 10:02:45PM +0800, Miaohe Lin wrote: >>>>>>>> The huge zero page could reach here and if we ever try to split it, the >>>>>>>> VM_BUG_ON_PAGE will be triggered in split_huge_page_to_list(). Also the >>>>>>>> non-lru compound movable pages could be taken for transhuge pages. Skip >>>>>>>> these pages by checking PageLRU because huge zero page isn't lru page as >>>>>>>> non-lru compound movable pages. >>>>>>> >>>>>>> It seems that memory_failure() also fails at get_any_page() with "hwpoison: >>>>>>> unhandlable page" message. >>>>>>> >>>>>>> [16478.203474] page:00000000b6acdbd1 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1810b4 >>>>>>> [16478.206612] flags: 0x57ffffc0801000(reserved|hwpoison|node=1|zone=2|lastcpupid=0x1fffff) >>>>>>> [16478.209411] raw: 0057ffffc0801000 fffff11bc6042d08 fffff11bc6042d08 0000000000000000 >>>>>>> [16478.211921] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 >>>>>>> [16478.214473] page dumped because: hwpoison: unhandlable page >>>>>>> [16478.216386] Memory failure: 0x1810b4: recovery action for unknown page: Ignored >>>>>>> >>>>>>> We can't handle errors on huge (or normal) zero page, so the current >>>>>> >>>>>> Sorry for confusing commit log again. I should have a coffee before I make this patch. >>>>>> Huge or normal zero page will fail at get_any_page because they're neither HWPoisonHandlable >>>>>> nor PageHuge. >>>>>> >>>>>>> behavior seems to me more suitable than "unsplit thp". >>>>>>> >>>>>>> Or if you have some producer to reach the following path with huge zero >>>>>>> page, could you share it? >>>>>>> >>>>>> >>>>>> What I mean is that non-lru movable compound page can reach here unexpected because __PageMovable(page) >>>>>> is handleable now. So get_any_page could succeed to grab the page refcnt. And since it's compound page, >>>>>> it will go through the split_huge_page_to_list because PageTransHuge checks PageHead(page) which can also >>>>>> be true for compound page. But this type of pages is unexpected for split_huge_page_to_list. >>>>> >>>>> Can we really handle non-LRU movable pages in memory failure >>>>> (uncorrectable errors)? Typically they are balloon, zsmalloc, etc. >>>>> Assuming we run into a base (4K) non-LRU movable page, we could reach >>>>> as far as identify_page_state(), it should not fall into any category >>>>> except me_unknown. So it seems we could just simply make it >>>>> unhandlable. >>>> >>>> There is the comment from memory_failure: >>>> /* >>>> * We ignore non-LRU pages for good reasons. >>>> * - PG_locked is only well defined for LRU pages and a few others >>>> * - to avoid races with __SetPageLocked() >>>> * - to avoid races with __SetPageSlab*() (and more non-atomic ops) >>>> * The check (unnecessarily) ignores LRU pages being isolated and >>>> * walked by the page reclaim code, however that's not a big loss. >>>> */ >>>> >>>> So we could not handle non-LRU movable pages. >>>> >>>> What do you mean is something like below? >>>> >>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c >>>> index 5444a8ef4867..d80dbe0f20b6 100644 >>>> --- a/mm/memory-failure.c >>>> +++ b/mm/memory-failure.c >>>> @@ -1784,6 +1784,13 @@ int memory_failure(unsigned long pfn, int flags) >>>> } >>>> } >>>> >>>> + if (__PageMovable(hpage)) { >>>> + put_page(p); >>>> + action_result(pfn, MF_MSG_MOVALBE_PAGE, MF_IGNORED); >>>> + res = -EBUSY; >>>> + goto unlock_mutex; >>>> + } >>>> + >>>> if (PageTransHuge(hpage)) { >>>> /* >>>> * The flag must be set after the refcount is bumped >>>> >>>> >>>> i.e. Simply make non-LRU movable pages unhandlable ? >>> >> >> I think about the below code more carefully and I found that this will make >> hwpoison_filter can't handle the non-LRU movable pages now. Because non-LRU >> movable pages return early now and thus can't reach the hwpoison_filter. This >> results in a inconsistent behavior with previous one. So I think the origin >> fixup of this patch is more suitable. What do you think? > > I'm not familiar with hwpoison_filter(), it seems like a test helper > for error injection. I don't think hwpoison_filter() is used to filter > unhandlable page, for example, slab page, IIUC. So the non-LRU movable > pages should be treated the same. If so, the old behavior was simply > wrong. I think you're right. hwpoison_filter should filter the handleable error. Thanks. > >> >> Thanks. >> >>> I'd prefer this personally. Something like the below (compile test only): >>> >>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c >>> index 5444a8ef4867..789e40909ade 100644 >>> --- a/mm/memory-failure.c >>> +++ b/mm/memory-failure.c >>> @@ -1176,12 +1176,18 @@ void ClearPageHWPoisonTakenOff(struct page *page) >>> * does not return true for hugetlb or device memory pages, so it's assumed >>> * to be called only in the context where we never have such pages. >>> */ >>> -static inline bool HWPoisonHandlable(struct page *page) >>> +static inline bool HWPoisonHandlable(struct page *page, unsigned long flags) >>> { >>> - return PageLRU(page) || __PageMovable(page) || is_free_buddy_page(page); >>> + bool movable = false; >>> + >>> + /* Soft offline could mirgate non-LRU movable pages */ >>> + if ((flags & MF_SOFT_OFFLINE) && __PageMovable(page)) >>> + movable = true; >>> + >>> + return movable || PageLRU(page) || is_free_buddy_page(page); >>> } >>> >>> -static int __get_hwpoison_page(struct page *page) >>> +static int __get_hwpoison_page(struct page *page, unsigned long flags) >>> { >>> struct page *head = compound_head(page); >>> int ret = 0; >>> @@ -1196,7 +1202,7 @@ static int __get_hwpoison_page(struct page *page) >>> * for any unsupported type of page in order to reduce the risk of >>> * unexpected races caused by taking a page refcount. >>> */ >>> - if (!HWPoisonHandlable(head)) >>> + if (!HWPoisonHandlable(head, flags)) >>> return -EBUSY; >>> >>> if (get_page_unless_zero(head)) { >>> @@ -1221,7 +1227,7 @@ static int get_any_page(struct page *p, unsigned >>> long flags) >>> >>> try_again: >>> if (!count_increased) { >>> - ret = __get_hwpoison_page(p); >>> + ret = __get_hwpoison_page(p, flags); >>> if (!ret) { >>> if (page_count(p)) { >>> /* We raced with an allocation, retry. */ >>> @@ -1249,7 +1255,7 @@ static int get_any_page(struct page *p, unsigned >>> long flags) >>> } >>> } >>> >>> - if (PageHuge(p) || HWPoisonHandlable(p)) { >>> + if (PageHuge(p) || HWPoisonHandlable(p, flags)) { >>> ret = 1; >>> } else { >>> /* >>> >>>> >>>>> >>>>> But it should be handlable for soft-offline since it could be migrated. >>>>> >>>> >>>> Yes, non-LRU movable pages can be simply migrated. >>>> >>>> Many thanks. >>>> >>>>> >>>>>> Does this make sense for you? Thanks Naoya. >>>>>> >>>>>>> Thanks, >>>>>>> Naoya Horiguchi >>>>>>> >>>>>>>> >>>>>>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >>>>>>>> --- >>>>>>>> mm/memory-failure.c | 14 ++++++++++++++ >>>>>>>> 1 file changed, 14 insertions(+) >>>>>>>> >>>>>>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c >>>>>>>> index 23bfd809dc8c..ac6492e36978 100644 >>>>>>>> --- a/mm/memory-failure.c >>>>>>>> +++ b/mm/memory-failure.c >>>>>>>> @@ -1792,6 +1792,20 @@ int memory_failure(unsigned long pfn, int flags) >>>>>>>> } >>>>>>>> >>>>>>>> if (PageTransHuge(hpage)) { >>>>>>>> + /* >>>>>>>> + * The non-lru compound movable pages could be taken for >>>>>>>> + * transhuge pages. Also huge zero page could reach here >>>>>>>> + * and if we ever try to split it, the VM_BUG_ON_PAGE will >>>>>>>> + * be triggered in split_huge_page_to_list(). Skip these >>>>>>>> + * pages by checking PageLRU because huge zero page isn't >>>>>>>> + * lru page as non-lru compound movable pages. >>>>>>>> + */ >>>>>>>> + if (!PageLRU(hpage)) { >>>>>>>> + put_page(p); >>>>>>>> + action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED); >>>>>>>> + res = -EBUSY; >>>>>>>> + goto unlock_mutex; >>>>>>>> + } >>>>>>>> /* >>>>>>>> * The flag must be set after the refcount is bumped >>>>>>>> * otherwise it may race with THP split. >>>>>>>> -- >>>>>>>> 2.23.0 >>>>>> >>>>>> >>>>> . >>>>> >>>> >>> . >>> >> >> > . >
diff --git a/mm/memory-failure.c b/mm/memory-failure.c index 23bfd809dc8c..ac6492e36978 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1792,6 +1792,20 @@ int memory_failure(unsigned long pfn, int flags) } if (PageTransHuge(hpage)) { + /* + * The non-lru compound movable pages could be taken for + * transhuge pages. Also huge zero page could reach here + * and if we ever try to split it, the VM_BUG_ON_PAGE will + * be triggered in split_huge_page_to_list(). Skip these + * pages by checking PageLRU because huge zero page isn't + * lru page as non-lru compound movable pages. + */ + if (!PageLRU(hpage)) { + put_page(p); + action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED); + res = -EBUSY; + goto unlock_mutex; + } /* * The flag must be set after the refcount is bumped * otherwise it may race with THP split.
The huge zero page could reach here and if we ever try to split it, the VM_BUG_ON_PAGE will be triggered in split_huge_page_to_list(). Also the non-lru compound movable pages could be taken for transhuge pages. Skip these pages by checking PageLRU because huge zero page isn't lru page as non-lru compound movable pages. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> --- mm/memory-failure.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)