Message ID | 20220216091431.39406-5-linmiaohe@huawei.com (mailing list archive) |
---|---|
State | New |
Series | A few cleanup and fixup patches for memory failure |
On Wed, Feb 16, 2022 at 05:14:27PM +0800, Miaohe Lin wrote:
> We're only intended to deal with the non-Compound page after we split thp
> in memory_failure. However, the page could have changed compound pages due
> to race window. If this happens, we could try again to hopefully handle the
> page next round. Also remove unneeded orig_head. It's always equal to the
> hpage. So we can use hpage directly and remove this redundant one.
>
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  mm/memory-failure.c | 20 ++++++++++++--------
>  1 file changed, 12 insertions(+), 8 deletions(-)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 7e205d91b2d7..d66f642888be 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1690,7 +1690,6 @@ int memory_failure(unsigned long pfn, int flags)
>  {
>  	struct page *p;
>  	struct page *hpage;
> -	struct page *orig_head;
>  	struct dev_pagemap *pgmap;
>  	int res = 0;
>  	unsigned long page_flags;
> @@ -1736,7 +1735,7 @@ int memory_failure(unsigned long pfn, int flags)
>  		goto unlock_mutex;
>  	}
>
> -	orig_head = hpage = compound_head(p);
> +	hpage = compound_head(p);
>  	num_poisoned_pages_inc();
>
>  	/*
> @@ -1817,13 +1816,18 @@ int memory_failure(unsigned long pfn, int flags)
>  	lock_page(p);
>
>  	/*
> -	 * The page could have changed compound pages during the locking.
> -	 * If this happens just bail out.
> +	 * We're only intended to deal with the non-Compound page here.
> +	 * However, the page could have changed compound pages due to
> +	 * race window. If this happens, we could try again to hopefully
> +	 * handle the page next round.
>  	 */
> -	if (PageCompound(p) && compound_head(p) != orig_head) {
> -		action_result(pfn, MF_MSG_DIFFERENT_COMPOUND, MF_IGNORED);
> -		res = -EBUSY;
> -		goto unlock_page;
> +	if (PageCompound(p)) {
> +		if (TestClearPageHWPoison(p))
> +			num_poisoned_pages_dec();
> +		unlock_page(p);
> +		put_page(p);
> +		flags &= ~MF_COUNT_INCREASED;

Could you limit the retry chance only once by using the local variable
"retry"? It might be very rare to hit the race more than once in a single
error event, but just to be safe from potential infinite loop (that could be
opened by future changes).

Thanks,
Naoya Horiguchi

> +		goto try_again;
>  	}
>
>  	/*
> --
> 2.23.0
On 2022/2/18 9:13, HORIGUCHI NAOYA(堀口 直也) wrote:
> On Wed, Feb 16, 2022 at 05:14:27PM +0800, Miaohe Lin wrote:
>> We're only intended to deal with the non-Compound page after we split thp
>> in memory_failure. However, the page could have changed compound pages due
>> to race window. If this happens, we could try again to hopefully handle the
>> page next round. Also remove unneeded orig_head. It's always equal to the
>> hpage. So we can use hpage directly and remove this redundant one.
>>
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>> ---
>>  mm/memory-failure.c | 20 ++++++++++++--------
>>  1 file changed, 12 insertions(+), 8 deletions(-)
>>
>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> index 7e205d91b2d7..d66f642888be 100644
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -1690,7 +1690,6 @@ int memory_failure(unsigned long pfn, int flags)
>>  {
>>  	struct page *p;
>>  	struct page *hpage;
>> -	struct page *orig_head;
>>  	struct dev_pagemap *pgmap;
>>  	int res = 0;
>>  	unsigned long page_flags;
>> @@ -1736,7 +1735,7 @@ int memory_failure(unsigned long pfn, int flags)
>>  		goto unlock_mutex;
>>  	}
>>
>> -	orig_head = hpage = compound_head(p);
>> +	hpage = compound_head(p);
>>  	num_poisoned_pages_inc();
>>
>>  	/*
>> @@ -1817,13 +1816,18 @@ int memory_failure(unsigned long pfn, int flags)
>>  	lock_page(p);
>>
>>  	/*
>> -	 * The page could have changed compound pages during the locking.
>> -	 * If this happens just bail out.
>> +	 * We're only intended to deal with the non-Compound page here.
>> +	 * However, the page could have changed compound pages due to
>> +	 * race window. If this happens, we could try again to hopefully
>> +	 * handle the page next round.
>>  	 */
>> -	if (PageCompound(p) && compound_head(p) != orig_head) {
>> -		action_result(pfn, MF_MSG_DIFFERENT_COMPOUND, MF_IGNORED);
>> -		res = -EBUSY;
>> -		goto unlock_page;
>> +	if (PageCompound(p)) {
>> +		if (TestClearPageHWPoison(p))
>> +			num_poisoned_pages_dec();
>> +		unlock_page(p);
>> +		put_page(p);
>> +		flags &= ~MF_COUNT_INCREASED;
>
> Could you limit the retry chance only once by using the local variable
> "retry"? It might be very rare to hit the race more than once in a single
> error event, but just to be safe from potential infinite loop (that could be
> opened by future changes).
>

Sure. Will do it in V3. Thanks.

> Thanks,
> Naoya Horiguchi
>
>> +		goto try_again;
>>  	}
>>
>>  	/*
>> --
>> 2.23.0
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 7e205d91b2d7..d66f642888be 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1690,7 +1690,6 @@ int memory_failure(unsigned long pfn, int flags)
 {
 	struct page *p;
 	struct page *hpage;
-	struct page *orig_head;
 	struct dev_pagemap *pgmap;
 	int res = 0;
 	unsigned long page_flags;
@@ -1736,7 +1735,7 @@ int memory_failure(unsigned long pfn, int flags)
 		goto unlock_mutex;
 	}

-	orig_head = hpage = compound_head(p);
+	hpage = compound_head(p);
 	num_poisoned_pages_inc();

 	/*
@@ -1817,13 +1816,18 @@ int memory_failure(unsigned long pfn, int flags)
 	lock_page(p);

 	/*
-	 * The page could have changed compound pages during the locking.
-	 * If this happens just bail out.
+	 * We're only intended to deal with the non-Compound page here.
+	 * However, the page could have changed compound pages due to
+	 * race window. If this happens, we could try again to hopefully
+	 * handle the page next round.
 	 */
-	if (PageCompound(p) && compound_head(p) != orig_head) {
-		action_result(pfn, MF_MSG_DIFFERENT_COMPOUND, MF_IGNORED);
-		res = -EBUSY;
-		goto unlock_page;
+	if (PageCompound(p)) {
+		if (TestClearPageHWPoison(p))
+			num_poisoned_pages_dec();
+		unlock_page(p);
+		put_page(p);
+		flags &= ~MF_COUNT_INCREASED;
+		goto try_again;
 	}

 	/*
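The review asks that the `goto try_again` added in the last hunk be taken at most once, guarded by a local `retry` variable. The following is only a userspace sketch of that control flow, not the V3 patch: `struct fake_page`, `fake_page_is_compound()` and `handle_with_single_retry()` are hypothetical names invented for illustration, and returning -EBUSY once the single retry is used up is an assumption borrowed from the bail-out path the patch removes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the race: the page is reported as compound
 * for the first `races_left` checks, then as a normal page.
 */
struct fake_page {
	int races_left;
};

static bool fake_page_is_compound(struct fake_page *p)
{
	if (p->races_left > 0) {
		p->races_left--;
		return true;
	}
	return false;
}

/*
 * Model of the bounded retry: the compound-page check may jump back to
 * try_again once. A second hit gives up with -EBUSY (assumed error code)
 * instead of looping. *attempts records how many passes were made.
 */
static int handle_with_single_retry(struct fake_page *p, int *attempts)
{
	bool retry = true;	/* one retry is allowed */

	assert(p != NULL && attempts != NULL);
	*attempts = 0;

try_again:
	(*attempts)++;
	if (fake_page_is_compound(p)) {
		if (retry) {
			retry = false;	/* consume the only retry */
			goto try_again;
		}
		return -EBUSY;		/* race hit twice: bail out */
	}
	return 0;			/* page handled */
}
```

With this shape the function makes at most two passes however often the race recurs, which is the property the reviewer wants to keep even if later changes widen the race window.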
We're only intended to deal with the non-Compound page after we split thp
in memory_failure. However, the page could have changed compound pages due
to race window. If this happens, we could try again to hopefully handle the
page next round. Also remove unneeded orig_head. It's always equal to the
hpage. So we can use hpage directly and remove this redundant one.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/memory-failure.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)
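Before jumping back to try_again, the patch undoes the poison accounting: the counter is decremented only if the test-and-clear actually cleared the flag. A rough userspace model of that pairing follows, using C11 atomics. The names `fake_page`, `mark_poisoned()`, `undo_poison_before_retry()` and the `num_poisoned` counter are illustrative assumptions, not kernel APIs, and the model leaves out the page lock and refcount handling that the real code also releases.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a page carrying only a poison flag. */
struct fake_page {
	atomic_bool hwpoison;
};

/* Hypothetical global counter, standing in for the kernel's poisoned-page count. */
static atomic_long num_poisoned;

/* Set the flag; count the page only if this call changed the flag. */
static void mark_poisoned(struct fake_page *p)
{
	assert(p != NULL);
	if (!atomic_exchange(&p->hwpoison, true))
		atomic_fetch_add(&num_poisoned, 1);
}

/*
 * Roll back before a retry: atomically clear the flag and decrement the
 * counter only when the flag was set. Calling this twice must not
 * decrement twice.
 */
static void undo_poison_before_retry(struct fake_page *p)
{
	assert(p != NULL);
	if (atomic_exchange(&p->hwpoison, false))
		atomic_fetch_sub(&num_poisoned, 1);
}
```

Tying the decrement to the result of the atomic exchange keeps the counter balanced even if the rollback runs more than once for the same page, for example when another path has already cleared the flag.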