Message ID | 20210904091839.20270-1-linmiaohe@huawei.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm/page_isolation: don't putback unisolated page | expand |
On 04.09.21 11:18, Miaohe Lin wrote: > If __isolate_free_page() failed, due to zone watermark check, the page is > still on the free list. But this page will be put back to free list again > via __putback_isolated_page() now. This may trigger page->flags checks in > __free_one_page() if PageReported is set. Or we will corrupt the free list > because list_add() will be called for pages already on another list. > > Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock") > Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> > --- > mm/page_isolation.c | 6 ++---- > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/mm/page_isolation.c b/mm/page_isolation.c > index 9bb562d5d194..7d70d772525c 100644 > --- a/mm/page_isolation.c > +++ b/mm/page_isolation.c > @@ -93,10 +93,8 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype) > buddy_pfn = __find_buddy_pfn(pfn, order); > buddy = page + (buddy_pfn - pfn); > > - if (!is_migrate_isolate_page(buddy)) { > - __isolate_free_page(page, order); > - isolated_page = true; > - } > + if (!is_migrate_isolate_page(buddy)) > + isolated_page = !!__isolate_free_page(page, order); > } > } > > Shouldn't we much rather force to ignore watermarks here and make sure __isolate_free_page() never fails?
On 2021/9/6 17:40, David Hildenbrand wrote: > On 04.09.21 11:18, Miaohe Lin wrote: >> If __isolate_free_page() failed, due to zone watermark check, the page is >> still on the free list. But this page will be put back to free list again >> via __putback_isolated_page() now. This may trigger page->flags checks in >> __free_one_page() if PageReported is set. Or we will corrupt the free list >> because list_add() will be called for pages already on another list. >> >> Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock") >> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >> --- >> mm/page_isolation.c | 6 ++---- >> 1 file changed, 2 insertions(+), 4 deletions(-) >> >> diff --git a/mm/page_isolation.c b/mm/page_isolation.c >> index 9bb562d5d194..7d70d772525c 100644 >> --- a/mm/page_isolation.c >> +++ b/mm/page_isolation.c >> @@ -93,10 +93,8 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype) >> buddy_pfn = __find_buddy_pfn(pfn, order); >> buddy = page + (buddy_pfn - pfn); >> - if (!is_migrate_isolate_page(buddy)) { >> - __isolate_free_page(page, order); >> - isolated_page = true; >> - } >> + if (!is_migrate_isolate_page(buddy)) >> + isolated_page = !!__isolate_free_page(page, order); >> } >> } >> > > Shouldn't we much rather force to ignore watermarks here and make sure __isolate_free_page() never fails? It seems it is not easy to force to ignore watermarks here. And it's not a problem if __isolate_free_page() fails because we can do move_freepages_block() anyway. What do you think? Many thanks. >
On 06.09.21 13:32, Miaohe Lin wrote: > On 2021/9/6 17:40, David Hildenbrand wrote: >> On 04.09.21 11:18, Miaohe Lin wrote: >>> If __isolate_free_page() failed, due to zone watermark check, the page is >>> still on the free list. But this page will be put back to free list again >>> via __putback_isolated_page() now. This may trigger page->flags checks in >>> __free_one_page() if PageReported is set. Or we will corrupt the free list >>> because list_add() will be called for pages already on another list. >>> >>> Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock") >>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >>> --- >>> mm/page_isolation.c | 6 ++---- >>> 1 file changed, 2 insertions(+), 4 deletions(-) >>> >>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c >>> index 9bb562d5d194..7d70d772525c 100644 >>> --- a/mm/page_isolation.c >>> +++ b/mm/page_isolation.c >>> @@ -93,10 +93,8 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype) >>> buddy_pfn = __find_buddy_pfn(pfn, order); >>> buddy = page + (buddy_pfn - pfn); >>> - if (!is_migrate_isolate_page(buddy)) { >>> - __isolate_free_page(page, order); >>> - isolated_page = true; >>> - } >>> + if (!is_migrate_isolate_page(buddy)) >>> + isolated_page = !!__isolate_free_page(page, order); >>> } >>> } >>> >> >> Shouldn't we much rather force to ignore watermarks here and make sure __isolate_free_page() never fails? > > It seems it is not easy to force to ignore watermarks here. And it's not a problem > if __isolate_free_page() fails because we can do move_freepages_block() anyway. > What do you think? Many thanks. I'm wondering if all this complexity in this function is even required. What about something like this: (completely untested) diff --git a/mm/page_isolation.c b/mm/page_isolation.c index bddf788f45bf..29ff2fcb339c 100644 --- a/mm/page_isolation.c +++ b/mm/page_isolation.c @@ -66,12 +66,10 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_ static void unset_migratetype_isolate(struct page *page, unsigned migratetype) { + bool buddy_merge_possible = false; struct zone *zone; unsigned long flags, nr_pages; - bool isolated_page = false; unsigned int order; - unsigned long pfn, buddy_pfn; - struct page *buddy; zone = page_zone(page); spin_lock_irqsave(&zone->lock, flags); @@ -79,26 +77,15 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype) goto out; /* - * Because freepage with more than pageblock_order on isolated - * pageblock is restricted to merge due to freepage counting problem, - * it is possible that there is free buddy page. - * move_freepages_block() doesn't care of merge so we need other - * approach in order to merge them. Isolation and free will make - * these pages to be merged. + * If our free page spans at least this whole pageblock and could + * eventually get merged into an even bigger page, go via + * __putback_isolated_page(), because move_freepages_block() won't + * trigger merging of free pages. */ if (PageBuddy(page)) { order = buddy_order(page); - if (order >= pageblock_order && order < MAX_ORDER - 1) { - pfn = page_to_pfn(page); - buddy_pfn = __find_buddy_pfn(pfn, order); - buddy = page + (buddy_pfn - pfn); - - if (pfn_valid_within(buddy_pfn) && - !is_migrate_isolate_page(buddy)) { - __isolate_free_page(page, order); - isolated_page = true; - } - } + if (order >= pageblock_order && order < MAX_ORDER - 1) + buddy_merge_possible = true; } /* @@ -111,12 +98,12 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype) * onlining - just onlined memory won't immediately be considered for * allocation. */ - if (!isolated_page) { + if (!buddy_merge_possible) { nr_pages = move_freepages_block(zone, page, migratetype, NULL); __mod_zone_freepage_state(zone, nr_pages, migratetype); } set_pageblock_migratetype(page, migratetype); - if (isolated_page) + if (buddy_merge_possible) __putback_isolated_page(page, order, migratetype); zone->nr_isolate_pageblock--; out:
On 06.09.21 13:50, David Hildenbrand wrote: > On 06.09.21 13:32, Miaohe Lin wrote: >> On 2021/9/6 17:40, David Hildenbrand wrote: >>> On 04.09.21 11:18, Miaohe Lin wrote: >>>> If __isolate_free_page() failed, due to zone watermark check, the page is >>>> still on the free list. But this page will be put back to free list again >>>> via __putback_isolated_page() now. This may trigger page->flags checks in >>>> __free_one_page() if PageReported is set. Or we will corrupt the free list >>>> because list_add() will be called for pages already on another list. >>>> >>>> Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock") >>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >>>> --- >>>> mm/page_isolation.c | 6 ++---- >>>> 1 file changed, 2 insertions(+), 4 deletions(-) >>>> >>>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c >>>> index 9bb562d5d194..7d70d772525c 100644 >>>> --- a/mm/page_isolation.c >>>> +++ b/mm/page_isolation.c >>>> @@ -93,10 +93,8 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype) >>>> buddy_pfn = __find_buddy_pfn(pfn, order); >>>> buddy = page + (buddy_pfn - pfn); >>>> - if (!is_migrate_isolate_page(buddy)) { >>>> - __isolate_free_page(page, order); >>>> - isolated_page = true; >>>> - } >>>> + if (!is_migrate_isolate_page(buddy)) >>>> + isolated_page = !!__isolate_free_page(page, order); >>>> } >>>> } >>>> >>> >>> Shouldn't we much rather force to ignore watermarks here and make sure __isolate_free_page() never fails? >> >> It seems it is not easy to force to ignore watermarks here. And it's not a problem >> if __isolate_free_page() fails because we can do move_freepages_block() anyway. >> What do you think? Many thanks. > > I'm wondering if all this complexity in this function is even required. What about something like this: (completely untested) > > diff --git a/mm/page_isolation.c b/mm/page_isolation.c > index bddf788f45bf..29ff2fcb339c 100644 > --- a/mm/page_isolation.c > +++ b/mm/page_isolation.c > @@ -66,12 +66,10 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_ > > static void unset_migratetype_isolate(struct page *page, unsigned migratetype) > { > + bool buddy_merge_possible = false; > struct zone *zone; > unsigned long flags, nr_pages; > - bool isolated_page = false; > unsigned int order; > - unsigned long pfn, buddy_pfn; > - struct page *buddy; > > zone = page_zone(page); > spin_lock_irqsave(&zone->lock, flags); > @@ -79,26 +77,15 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype) > goto out; > > /* > - * Because freepage with more than pageblock_order on isolated > - * pageblock is restricted to merge due to freepage counting problem, > - * it is possible that there is free buddy page. > - * move_freepages_block() doesn't care of merge so we need other > - * approach in order to merge them. Isolation and free will make > - * these pages to be merged. > + * If our free page spans at least this whole pageblock and could > + * eventually get merged into an even bigger page, go via > + * __putback_isolated_page(), because move_freepages_block() won't > + * trigger merging of free pages. > */ > if (PageBuddy(page)) { > order = buddy_order(page); > - if (order >= pageblock_order && order < MAX_ORDER - 1) { > - pfn = page_to_pfn(page); > - buddy_pfn = __find_buddy_pfn(pfn, order); > - buddy = page + (buddy_pfn - pfn); > - > - if (pfn_valid_within(buddy_pfn) && > - !is_migrate_isolate_page(buddy)) { > - __isolate_free_page(page, order); > - isolated_page = true; > - } > - } > + if (order >= pageblock_order && order < MAX_ORDER - 1) > + buddy_merge_possible = true; > } > > /* > @@ -111,12 +98,12 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype) > * onlining - just onlined memory won't immediately be considered for > * allocation. > */ > - if (!isolated_page) { > + if (!buddy_merge_possible) { > nr_pages = move_freepages_block(zone, page, migratetype, NULL); > __mod_zone_freepage_state(zone, nr_pages, migratetype); > } > set_pageblock_migratetype(page, migratetype); > - if (isolated_page) > + if (buddy_merge_possible) > __putback_isolated_page(page, order, migratetype); > zone->nr_isolate_pageblock--; > out: Okay, I just had another look -- that won't work because as you correctly said, it still is on the freelist ... So your fix is certainly correct :)
On 04.09.21 11:18, Miaohe Lin wrote: > If __isolate_free_page() failed, due to zone watermark check, the page is > still on the free list. But this page will be put back to free list again > via __putback_isolated_page() now. This may trigger page->flags checks in > __free_one_page() if PageReported is set. Or we will corrupt the free list > because list_add() will be called for pages already on another list. > > Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock") > Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> > --- > mm/page_isolation.c | 6 ++---- > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/mm/page_isolation.c b/mm/page_isolation.c > index 9bb562d5d194..7d70d772525c 100644 > --- a/mm/page_isolation.c > +++ b/mm/page_isolation.c > @@ -93,10 +93,8 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype) > buddy_pfn = __find_buddy_pfn(pfn, order); > buddy = page + (buddy_pfn - pfn); > > - if (!is_migrate_isolate_page(buddy)) { > - __isolate_free_page(page, order); > - isolated_page = true; > - } > + if (!is_migrate_isolate_page(buddy)) > + isolated_page = !!__isolate_free_page(page, order); > } > } > > Thanks! Reviewed-by: David Hildenbrand <david@redhat.com>
On 2021/9/6 20:01, David Hildenbrand wrote: > On 06.09.21 13:50, David Hildenbrand wrote: >> On 06.09.21 13:32, Miaohe Lin wrote: >>> On 2021/9/6 17:40, David Hildenbrand wrote: >>>> On 04.09.21 11:18, Miaohe Lin wrote: >>>>> If __isolate_free_page() failed, due to zone watermark check, the page is >>>>> still on the free list. But this page will be put back to free list again >>>>> via __putback_isolated_page() now. This may trigger page->flags checks in >>>>> __free_one_page() if PageReported is set. Or we will corrupt the free list >>>>> because list_add() will be called for pages already on another list. >>>>> >>>>> Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock") >>>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >>>>> --- >>>>> mm/page_isolation.c | 6 ++---- >>>>> 1 file changed, 2 insertions(+), 4 deletions(-) >>>>> >>>>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c >>>>> index 9bb562d5d194..7d70d772525c 100644 >>>>> --- a/mm/page_isolation.c >>>>> +++ b/mm/page_isolation.c >>>>> @@ -93,10 +93,8 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype) >>>>> buddy_pfn = __find_buddy_pfn(pfn, order); >>>>> buddy = page + (buddy_pfn - pfn); >>>>> - if (!is_migrate_isolate_page(buddy)) { >>>>> - __isolate_free_page(page, order); >>>>> - isolated_page = true; >>>>> - } >>>>> + if (!is_migrate_isolate_page(buddy)) >>>>> + isolated_page = !!__isolate_free_page(page, order); >>>>> } >>>>> } >>>>> >>>> >>>> Shouldn't we much rather force to ignore watermarks here and make sure __isolate_free_page() never fails? >>> >>> It seems it is not easy to force to ignore watermarks here. And it's not a problem >>> if __isolate_free_page() fails because we can do move_freepages_block() anyway. >>> What do you think? Many thanks. >> >> I'm wondering if all this complexity in this function is even required. What about something like this: (completely untested) >> >> diff --git a/mm/page_isolation.c b/mm/page_isolation.c >> index bddf788f45bf..29ff2fcb339c 100644 >> --- a/mm/page_isolation.c >> +++ b/mm/page_isolation.c >> @@ -66,12 +66,10 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_ >> static void unset_migratetype_isolate(struct page *page, unsigned migratetype) >> { >> + bool buddy_merge_possible = false; >> struct zone *zone; >> unsigned long flags, nr_pages; >> - bool isolated_page = false; >> unsigned int order; >> - unsigned long pfn, buddy_pfn; >> - struct page *buddy; >> zone = page_zone(page); >> spin_lock_irqsave(&zone->lock, flags); >> @@ -79,26 +77,15 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype) >> goto out; >> /* >> - * Because freepage with more than pageblock_order on isolated >> - * pageblock is restricted to merge due to freepage counting problem, >> - * it is possible that there is free buddy page. >> - * move_freepages_block() doesn't care of merge so we need other >> - * approach in order to merge them. Isolation and free will make >> - * these pages to be merged. >> + * If our free page spans at least this whole pageblock and could >> + * eventually get merged into an even bigger page, go via >> + * __putback_isolated_page(), because move_freepages_block() won't >> + * trigger merging of free pages. >> */ >> if (PageBuddy(page)) { >> order = buddy_order(page); >> - if (order >= pageblock_order && order < MAX_ORDER - 1) { >> - pfn = page_to_pfn(page); >> - buddy_pfn = __find_buddy_pfn(pfn, order); >> - buddy = page + (buddy_pfn - pfn); >> - >> - if (pfn_valid_within(buddy_pfn) && >> - !is_migrate_isolate_page(buddy)) { >> - __isolate_free_page(page, order); >> - isolated_page = true; >> - } >> - } >> + if (order >= pageblock_order && order < MAX_ORDER - 1) >> + buddy_merge_possible = true; >> } >> /* >> @@ -111,12 +98,12 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype) >> * onlining - just onlined memory won't immediately be considered for >> * allocation. >> */ >> - if (!isolated_page) { >> + if (!buddy_merge_possible) { >> nr_pages = move_freepages_block(zone, page, migratetype, NULL); >> __mod_zone_freepage_state(zone, nr_pages, migratetype); >> } >> set_pageblock_migratetype(page, migratetype); >> - if (isolated_page) >> + if (buddy_merge_possible) >> __putback_isolated_page(page, order, migratetype); >> zone->nr_isolate_pageblock--; >> out: > > Okay, I just had another look -- that won't work because as you correctly said, it still is on the freelist ... > > So your fix is certainly correct :) Many thanks for your effort and suggestion. :) >
On 06.09.21 14:02, David Hildenbrand wrote: > On 04.09.21 11:18, Miaohe Lin wrote: >> If __isolate_free_page() failed, due to zone watermark check, the page is >> still on the free list. But this page will be put back to free list again >> via __putback_isolated_page() now. This may trigger page->flags checks in >> __free_one_page() if PageReported is set. Or we will corrupt the free list >> because list_add() will be called for pages already on another list. >> >> Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock") >> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >> --- >> mm/page_isolation.c | 6 ++---- >> 1 file changed, 2 insertions(+), 4 deletions(-) >> >> diff --git a/mm/page_isolation.c b/mm/page_isolation.c >> index 9bb562d5d194..7d70d772525c 100644 >> --- a/mm/page_isolation.c >> +++ b/mm/page_isolation.c >> @@ -93,10 +93,8 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype) >> buddy_pfn = __find_buddy_pfn(pfn, order); >> buddy = page + (buddy_pfn - pfn); >> >> - if (!is_migrate_isolate_page(buddy)) { >> - __isolate_free_page(page, order); >> - isolated_page = true; >> - } >> + if (!is_migrate_isolate_page(buddy)) >> + isolated_page = !!__isolate_free_page(page, order); >> } >> } >> >> > > Thanks! > > Reviewed-by: David Hildenbrand <david@redhat.com> > To make the confusion perfect (sorry) :D I tripple-checked: In unset_migratetype_isolate() we check that is_migrate_isolate_page(page) holds, otherwise we return. We call __isolate_free_page() only for such pages. __isolate_free_page() won't perform watermark checks on is_migrate_isolate(). Consequently, __isolate_free_page() should never fail when called from unset_migratetype_isolate() If that's correct then we could instead maybe add a VM_BUG_ON() and a comment why this can't fail. Makes sense or am I missing something?
On 2021/9/6 20:11, David Hildenbrand wrote: > On 06.09.21 14:02, David Hildenbrand wrote: >> On 04.09.21 11:18, Miaohe Lin wrote: >>> If __isolate_free_page() failed, due to zone watermark check, the page is >>> still on the free list. But this page will be put back to free list again >>> via __putback_isolated_page() now. This may trigger page->flags checks in >>> __free_one_page() if PageReported is set. Or we will corrupt the free list >>> because list_add() will be called for pages already on another list. >>> >>> Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock") >>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >>> --- >>> mm/page_isolation.c | 6 ++---- >>> 1 file changed, 2 insertions(+), 4 deletions(-) >>> >>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c >>> index 9bb562d5d194..7d70d772525c 100644 >>> --- a/mm/page_isolation.c >>> +++ b/mm/page_isolation.c >>> @@ -93,10 +93,8 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype) >>> buddy_pfn = __find_buddy_pfn(pfn, order); >>> buddy = page + (buddy_pfn - pfn); >>> - if (!is_migrate_isolate_page(buddy)) { >>> - __isolate_free_page(page, order); >>> - isolated_page = true; >>> - } >>> + if (!is_migrate_isolate_page(buddy)) >>> + isolated_page = !!__isolate_free_page(page, order); >>> } >>> } >>> >> >> Thanks! >> >> Reviewed-by: David Hildenbrand <david@redhat.com> >> > > To make the confusion perfect (sorry) :D I tripple-checked: > > In unset_migratetype_isolate() we check that is_migrate_isolate_page(page) holds, otherwise we return. > > We call __isolate_free_page() only for such pages. > > __isolate_free_page() won't perform watermark checks on is_migrate_isolate(). > > Consequently, __isolate_free_page() should never fail when called from unset_migratetype_isolate() > > If that's correct then we could instead maybe add a VM_BUG_ON() and a comment why this can't fail. > > > Makes sense or am I missing something? I think you're right. __isolate_free_page() should never fail when called from unset_migratetype_isolate() as explained by you. But it might be too fragile to reply on the failure conditions of __isolate_free_page(). If that changes, VM_BUG_ON() here might trigger unexpectedly. Or am I just over-worried as failure conditions of __isolate_free_page() can hardly change? Many thanks. :) >
On 06.09.21 14:45, Miaohe Lin wrote: > On 2021/9/6 20:11, David Hildenbrand wrote: >> On 06.09.21 14:02, David Hildenbrand wrote: >>> On 04.09.21 11:18, Miaohe Lin wrote: >>>> If __isolate_free_page() failed, due to zone watermark check, the page is >>>> still on the free list. But this page will be put back to free list again >>>> via __putback_isolated_page() now. This may trigger page->flags checks in >>>> __free_one_page() if PageReported is set. Or we will corrupt the free list >>>> because list_add() will be called for pages already on another list. >>>> >>>> Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock") >>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >>>> --- >>>> mm/page_isolation.c | 6 ++---- >>>> 1 file changed, 2 insertions(+), 4 deletions(-) >>>> >>>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c >>>> index 9bb562d5d194..7d70d772525c 100644 >>>> --- a/mm/page_isolation.c >>>> +++ b/mm/page_isolation.c >>>> @@ -93,10 +93,8 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype) >>>> buddy_pfn = __find_buddy_pfn(pfn, order); >>>> buddy = page + (buddy_pfn - pfn); >>>> - if (!is_migrate_isolate_page(buddy)) { >>>> - __isolate_free_page(page, order); >>>> - isolated_page = true; >>>> - } >>>> + if (!is_migrate_isolate_page(buddy)) >>>> + isolated_page = !!__isolate_free_page(page, order); >>>> } >>>> } >>>> >>> >>> Thanks! >>> >>> Reviewed-by: David Hildenbrand <david@redhat.com> >>> >> >> To make the confusion perfect (sorry) :D I tripple-checked: >> >> In unset_migratetype_isolate() we check that is_migrate_isolate_page(page) holds, otherwise we return. >> >> We call __isolate_free_page() only for such pages. >> >> __isolate_free_page() won't perform watermark checks on is_migrate_isolate(). >> >> Consequently, __isolate_free_page() should never fail when called from unset_migratetype_isolate() >> >> If that's correct then we could instead maybe add a VM_BUG_ON() and a comment why this can't fail. >> >> >> Makes sense or am I missing something? > > I think you're right. __isolate_free_page() should never fail when called from unset_migratetype_isolate() > as explained by you. But it might be too fragile to reply on the failure conditions of __isolate_free_page(). > If that changes, VM_BUG_ON() here might trigger unexpectedly. Or am I just over-worried as failure conditions > of __isolate_free_page() can hardly change? Maybe isolated_page = !!__isolate_free_page(page, order); /* * Isolating a free page in an isolated pageblock is expected to always * work as watermarks don't apply here. */ VM_BUG_ON(isolated_page); VM_BUG_ON() allows us to detect any issues when testing. Combined with the comment it tells everybody messing with __isolate_free_page() what we expect in this function. In production system, we would handle it gracefully.
On 2021/9/6 20:49, David Hildenbrand wrote: > On 06.09.21 14:45, Miaohe Lin wrote: >> On 2021/9/6 20:11, David Hildenbrand wrote: >>> On 06.09.21 14:02, David Hildenbrand wrote: >>>> On 04.09.21 11:18, Miaohe Lin wrote: >>>>> If __isolate_free_page() failed, due to zone watermark check, the page is >>>>> still on the free list. But this page will be put back to free list again >>>>> via __putback_isolated_page() now. This may trigger page->flags checks in >>>>> __free_one_page() if PageReported is set. Or we will corrupt the free list >>>>> because list_add() will be called for pages already on another list. >>>>> >>>>> Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock") >>>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >>>>> --- >>>>> mm/page_isolation.c | 6 ++---- >>>>> 1 file changed, 2 insertions(+), 4 deletions(-) >>>>> >>>>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c >>>>> index 9bb562d5d194..7d70d772525c 100644 >>>>> --- a/mm/page_isolation.c >>>>> +++ b/mm/page_isolation.c >>>>> @@ -93,10 +93,8 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype) >>>>> buddy_pfn = __find_buddy_pfn(pfn, order); >>>>> buddy = page + (buddy_pfn - pfn); >>>>> - if (!is_migrate_isolate_page(buddy)) { >>>>> - __isolate_free_page(page, order); >>>>> - isolated_page = true; >>>>> - } >>>>> + if (!is_migrate_isolate_page(buddy)) >>>>> + isolated_page = !!__isolate_free_page(page, order); >>>>> } >>>>> } >>>>> >>>> >>>> Thanks! >>>> >>>> Reviewed-by: David Hildenbrand <david@redhat.com> >>>> >>> >>> To make the confusion perfect (sorry) :D I tripple-checked: >>> >>> In unset_migratetype_isolate() we check that is_migrate_isolate_page(page) holds, otherwise we return. >>> >>> We call __isolate_free_page() only for such pages. >>> >>> __isolate_free_page() won't perform watermark checks on is_migrate_isolate(). >>> >>> Consequently, __isolate_free_page() should never fail when called from unset_migratetype_isolate() >>> >>> If that's correct then we could instead maybe add a VM_BUG_ON() and a comment why this can't fail. >>> >>> >>> Makes sense or am I missing something? >> >> I think you're right. __isolate_free_page() should never fail when called from unset_migratetype_isolate() >> as explained by you. But it might be too fragile to reply on the failure conditions of __isolate_free_page(). >> If that changes, VM_BUG_ON() here might trigger unexpectedly. Or am I just over-worried as failure conditions >> of __isolate_free_page() can hardly change? > > Maybe > > isolated_page = !!__isolate_free_page(page, order); > /* > * Isolating a free page in an isolated pageblock is expected to always > * work as watermarks don't apply here. > */ > VM_BUG_ON(isolated_page); Should this be VM_BUG_ON(!isolated_page) ? > > > VM_BUG_ON() allows us to detect any issues when testing. Combined with the comment it tells everybody messing with __isolate_free_page() what we expect in this function. > > In production system, we would handle it gracefully. > Sounds reasonable. Will do it in v2. Many thanks for your suggestion and effort!
On 9/6/21 14:49, David Hildenbrand wrote: > On 06.09.21 14:45, Miaohe Lin wrote: >> On 2021/9/6 20:11, David Hildenbrand wrote: >>> On 06.09.21 14:02, David Hildenbrand wrote: >>>> On 04.09.21 11:18, Miaohe Lin wrote: >>>> >>>> Thanks! >>>> >>>> Reviewed-by: David Hildenbrand <david@redhat.com> >>>> >>> >>> To make the confusion perfect (sorry) :D I tripple-checked: >>> >>> In unset_migratetype_isolate() we check that is_migrate_isolate_page(page) holds, otherwise we return. >>> >>> We call __isolate_free_page() only for such pages. >>> >>> __isolate_free_page() won't perform watermark checks on is_migrate_isolate(). >>> >>> Consequently, __isolate_free_page() should never fail when called from unset_migratetype_isolate() >>> >>> If that's correct then we could instead maybe add a VM_BUG_ON() and a comment why this can't fail. >>> >>> >>> Makes sense or am I missing something? >> >> I think you're right. __isolate_free_page() should never fail when called from unset_migratetype_isolate() >> as explained by you. But it might be too fragile to reply on the failure conditions of __isolate_free_page(). >> If that changes, VM_BUG_ON() here might trigger unexpectedly. Or am I just over-worried as failure conditions >> of __isolate_free_page() can hardly change? > > Maybe > > isolated_page = !!__isolate_free_page(page, order); > /* > * Isolating a free page in an isolated pageblock is expected to always > * work as watermarks don't apply here. > */ > VM_BUG_ON(isolated_page); > > > VM_BUG_ON() allows us to detect any issues when testing. Combined with > the comment it tells everybody messing with __isolate_free_page() what > we expect in this function. > > In production system, we would handle it gracefully. If this can be handled gracefully, then I'd rather go with VM_WARN_ON. Maybe even WARN_ON_ONCE?
On 07.09.21 03:46, Miaohe Lin wrote: > On 2021/9/6 20:49, David Hildenbrand wrote: >> On 06.09.21 14:45, Miaohe Lin wrote: >>> On 2021/9/6 20:11, David Hildenbrand wrote: >>>> On 06.09.21 14:02, David Hildenbrand wrote: >>>>> On 04.09.21 11:18, Miaohe Lin wrote: >>>>>> If __isolate_free_page() failed, due to zone watermark check, the page is >>>>>> still on the free list. But this page will be put back to free list again >>>>>> via __putback_isolated_page() now. This may trigger page->flags checks in >>>>>> __free_one_page() if PageReported is set. Or we will corrupt the free list >>>>>> because list_add() will be called for pages already on another list. >>>>>> >>>>>> Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock") >>>>>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> >>>>>> --- >>>>>> mm/page_isolation.c | 6 ++---- >>>>>> 1 file changed, 2 insertions(+), 4 deletions(-) >>>>>> >>>>>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c >>>>>> index 9bb562d5d194..7d70d772525c 100644 >>>>>> --- a/mm/page_isolation.c >>>>>> +++ b/mm/page_isolation.c >>>>>> @@ -93,10 +93,8 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype) >>>>>> buddy_pfn = __find_buddy_pfn(pfn, order); >>>>>> buddy = page + (buddy_pfn - pfn); >>>>>> - if (!is_migrate_isolate_page(buddy)) { >>>>>> - __isolate_free_page(page, order); >>>>>> - isolated_page = true; >>>>>> - } >>>>>> + if (!is_migrate_isolate_page(buddy)) >>>>>> + isolated_page = !!__isolate_free_page(page, order); >>>>>> } >>>>>> } >>>>>> >>>>> >>>>> Thanks! >>>>> >>>>> Reviewed-by: David Hildenbrand <david@redhat.com> >>>>> >>>> >>>> To make the confusion perfect (sorry) :D I tripple-checked: >>>> >>>> In unset_migratetype_isolate() we check that is_migrate_isolate_page(page) holds, otherwise we return. >>>> >>>> We call __isolate_free_page() only for such pages. >>>> >>>> __isolate_free_page() won't perform watermark checks on is_migrate_isolate(). >>>> >>>> Consequently, __isolate_free_page() should never fail when called from unset_migratetype_isolate() >>>> >>>> If that's correct then we could instead maybe add a VM_BUG_ON() and a comment why this can't fail. >>>> >>>> >>>> Makes sense or am I missing something? >>> >>> I think you're right. __isolate_free_page() should never fail when called from unset_migratetype_isolate() >>> as explained by you. But it might be too fragile to reply on the failure conditions of __isolate_free_page(). >>> If that changes, VM_BUG_ON() here might trigger unexpectedly. Or am I just over-worried as failure conditions >>> of __isolate_free_page() can hardly change? >> >> Maybe >> >> isolated_page = !!__isolate_free_page(page, order); >> /* >> * Isolating a free page in an isolated pageblock is expected to always >> * work as watermarks don't apply here. >> */ >> VM_BUG_ON(isolated_page); > > Should this be VM_BUG_ON(!isolated_page) ? > Ehm, yes :)
On 07.09.21 10:08, Vlastimil Babka wrote: > On 9/6/21 14:49, David Hildenbrand wrote: >> On 06.09.21 14:45, Miaohe Lin wrote: >>> On 2021/9/6 20:11, David Hildenbrand wrote: >>>> On 06.09.21 14:02, David Hildenbrand wrote: >>>>> On 04.09.21 11:18, Miaohe Lin wrote: >>>>> >>>>> Thanks! >>>>> >>>>> Reviewed-by: David Hildenbrand <david@redhat.com> >>>>> >>>> >>>> To make the confusion perfect (sorry) :D I tripple-checked: >>>> >>>> In unset_migratetype_isolate() we check that is_migrate_isolate_page(page) holds, otherwise we return. >>>> >>>> We call __isolate_free_page() only for such pages. >>>> >>>> __isolate_free_page() won't perform watermark checks on is_migrate_isolate(). >>>> >>>> Consequently, __isolate_free_page() should never fail when called from unset_migratetype_isolate() >>>> >>>> If that's correct then we could instead maybe add a VM_BUG_ON() and a comment why this can't fail. >>>> >>>> >>>> Makes sense or am I missing something? >>> >>> I think you're right. __isolate_free_page() should never fail when called from unset_migratetype_isolate() >>> as explained by you. But it might be too fragile to reply on the failure conditions of __isolate_free_page(). >>> If that changes, VM_BUG_ON() here might trigger unexpectedly. Or am I just over-worried as failure conditions >>> of __isolate_free_page() can hardly change? >> >> Maybe >> >> isolated_page = !!__isolate_free_page(page, order); >> /* >> * Isolating a free page in an isolated pageblock is expected to always >> * work as watermarks don't apply here. >> */ >> VM_BUG_ON(isolated_page); >> >> >> VM_BUG_ON() allows us to detect any issues when testing. Combined with >> the comment it tells everybody messing with __isolate_free_page() what >> we expect in this function. >> >> In production system, we would handle it gracefully. > > If this can be handled gracefully, then I'd rather go with VM_WARN_ON. > Maybe even WARN_ON_ONCE? > I think either VM_BUG_ON() or VM_WARN_ON() -- compiling the runtime checks out -- should be good enough. I'd just go with VM_BUG_ON(), because anybody messing with __isolate_free_page() should clearly spot that we expect the current handling. But no strong opinion.
On 9/7/21 2:56 AM, David Hildenbrand wrote: ... >> If this can be handled gracefully, then I'd rather go with VM_WARN_ON. >> Maybe even WARN_ON_ONCE? >> > > I think either VM_BUG_ON() or VM_WARN_ON() -- compiling the runtime checks out -- should be good > enough. > > I'd just go with VM_BUG_ON(), because anybody messing with __isolate_free_page() should clearly spot > that we expect the current handling. But no strong opinion. > If in doubt, WARN*() should be preferred over BUG*(). There's a pretty long history of "don't kill the machine unless you have to" emails about this, let me dig up one...OK, maybe not the best example, but the tip of the iceberg: http://lkml.iu.edu/hypermail/linux/kernel/1610.0/00878.html thanks,
On 09.09.21 00:42, John Hubbard wrote: > On 9/7/21 2:56 AM, David Hildenbrand wrote: > ... >>> If this can be handled gracefully, then I'd rather go with VM_WARN_ON. >>> Maybe even WARN_ON_ONCE? >>> >> >> I think either VM_BUG_ON() or VM_WARN_ON() -- compiling the runtime checks out -- should be good >> enough. >> >> I'd just go with VM_BUG_ON(), because anybody messing with __isolate_free_page() should clearly spot >> that we expect the current handling. But no strong opinion. >> > > If in doubt, WARN*() should be preferred over BUG*(). There's a pretty long > history of "don't kill the machine unless you have to" emails about this, let > me dig up one...OK, maybe not the best example, but the tip of the iceberg: Please note the subtle difference between BUG_ON and VM_BUG_ON. We expect VM_BUG_ON to be compiled out on any production system. So it's really only a mean to identify things that really shouldn't be like that during debugging/testing. Using WARN... instead of VM_BUG_ON is even worse for production systems. There are distros that set panic_on_warn, essentially converting WARN... into BUG...
On 9/9/21 10:56, David Hildenbrand wrote: > On 09.09.21 00:42, John Hubbard wrote: >> On 9/7/21 2:56 AM, David Hildenbrand wrote: >> ... >>>> If this can be handled gracefully, then I'd rather go with VM_WARN_ON. >>>> Maybe even WARN_ON_ONCE? >>>> >>> >>> I think either VM_BUG_ON() or VM_WARN_ON() -- compiling the runtime >>> checks out -- should be good >>> enough. >>> >>> I'd just go with VM_BUG_ON(), because anybody messing with >>> __isolate_free_page() should clearly spot >>> that we expect the current handling. But no strong opinion. >>> >> >> If in doubt, WARN*() should be preferred over BUG*(). There's a pretty long >> history of "don't kill the machine unless you have to" emails about this, let >> me dig up one...OK, maybe not the best example, but the tip of the iceberg: > > Please note the subtle difference between BUG_ON and VM_BUG_ON. We expect > VM_BUG_ON to be compiled out on any production system. So it's really only a > mean to identify things that really shouldn't be like that during > debugging/testing. IIRC Fedora used to have CONFIG_DEBUG_VM enabled, did it change? > Using WARN... instead of VM_BUG_ON is even worse for production systems. > There are distros that set panic_on_warn, essentially converting WARN... > into BUG... Uh, does any distro really do that?
On 09.09.21 11:07, Vlastimil Babka wrote: > On 9/9/21 10:56, David Hildenbrand wrote: >> On 09.09.21 00:42, John Hubbard wrote: >>> On 9/7/21 2:56 AM, David Hildenbrand wrote: >>> ... >>>>> If this can be handled gracefully, then I'd rather go with VM_WARN_ON. >>>>> Maybe even WARN_ON_ONCE? >>>>> >>>> >>>> I think either VM_BUG_ON() or VM_WARN_ON() -- compiling the runtime >>>> checks out -- should be good >>>> enough. >>>> >>>> I'd just go with VM_BUG_ON(), because anybody messing with >>>> __isolate_free_page() should clearly spot >>>> that we expect the current handling. But no strong opinion. >>>> >>> >>> If in doubt, WARN*() should be preferred over BUG*(). There's a pretty long >>> history of "don't kill the machine unless you have to" emails about this, let >>> me dig up one...OK, maybe not the best example, but the tip of the iceberg: >> >> Please note the subtle difference between BUG_ON and VM_BUG_ON. We expect >> VM_BUG_ON to be compiled out on any production system. So it's really only a >> mean to identify things that really shouldn't be like that during >> debugging/testing. > > IIRC Fedora used to have CONFIG_DEBUG_VM enabled, did it change? Excellent question. Apparently you are right. Fortunately it's not a distro to use in production ;) In kernel-ark: redhat/configs/fedora/generic/CONFIG_DEBUG_VM:CONFIG_DEBUG_VM=y While for ARK (rhel-next so to say) redhat/configs/ark/generic/CONFIG_DEBUG_VM:# CONFIG_DEBUG_VM is not set So yes, the VM_WARN_ON would then be preferred in that case. But it's something that should never ever happen unless reviewers and developers really mess up, so I don't actually would sleep over that. We have other WARN... that can trigger more easily. > >> Using WARN... instead of VM_BUG_ON is even worse for production systems. >> There are distros that set panic_on_warn, essentially converting WARN... >> into BUG... > > Uh, does any distro really do that? Apparently, so I was told by Greg a year ago or so when wanting to add WARN_ON(). The advisory is to us pr_warn_once() instead. I rememebr it was a debian based distro.
On 9/9/21 1:56 AM, David Hildenbrand wrote: > On 09.09.21 00:42, John Hubbard wrote: >> On 9/7/21 2:56 AM, David Hildenbrand wrote: >> ... >>>> If this can be handled gracefully, then I'd rather go with VM_WARN_ON. >>>> Maybe even WARN_ON_ONCE? >>>> >>> >>> I think either VM_BUG_ON() or VM_WARN_ON() -- compiling the runtime checks out -- should be good >>> enough. >>> >>> I'd just go with VM_BUG_ON(), because anybody messing with __isolate_free_page() should clearly spot >>> that we expect the current handling. But no strong opinion. >>> >> >> If in doubt, WARN*() should be preferred over BUG*(). There's a pretty long >> history of "don't kill the machine unless you have to" emails about this, let >> me dig up one...OK, maybe not the best example, but the tip of the iceberg: > > Please note the subtle difference between BUG_ON and VM_BUG_ON. We expect VM_BUG_ON to be compiled > out on any production system. So it's really only a mean to identify things that really shouldn't be > like that during debugging/testing. > Yes, but the end result is the same: it halts the system. It don't think it changes the reasoning about BUG vs WARN very much. > Using WARN... instead of VM_BUG_ON is even worse for production systems. There are distros that set > panic_on_warn, essentially converting WARN... into BUG... > This is different than BUG: panic() *prints a backtrace*, and then reboots the system. That is still awkward, but a little more debuggable. thanks,
diff --git a/mm/page_isolation.c b/mm/page_isolation.c index 9bb562d5d194..7d70d772525c 100644 --- a/mm/page_isolation.c +++ b/mm/page_isolation.c @@ -93,10 +93,8 @@ static void unset_migratetype_isolate(struct page *page, unsigned migratetype) buddy_pfn = __find_buddy_pfn(pfn, order); buddy = page + (buddy_pfn - pfn); - if (!is_migrate_isolate_page(buddy)) { - __isolate_free_page(page, order); - isolated_page = true; - } + if (!is_migrate_isolate_page(buddy)) + isolated_page = !!__isolate_free_page(page, order); } }
If __isolate_free_page() failed, due to zone watermark check, the page is still on the free list. But this page will be put back to free list again via __putback_isolated_page() now. This may trigger page->flags checks in __free_one_page() if PageReported is set. Or we will corrupt the free list because list_add() will be called for pages already on another list. Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> --- mm/page_isolation.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)