Message ID: 20230805110711.2975149-1-shikemeng@huaweicloud.com (mailing list archive)
Series: Fixes and cleanups to compaction
On Sat, Aug 05, 2023 at 07:07:02PM +0800, Kemeng Shi wrote: > Hi all, this is another series to do fix and clean up to compaction. > Patch 1-2 fix and clean up freepage list operation. > Patch 3-4 fix and clean up isolation of freepages > Patch 7-9 factor code to check if compaction is needed for allocation > order. > More details can be found in respective patches. Thanks! As with your last patch series, half of the patches are missing. Looks like they didn't make it to lore.kernel.org either: https://lore.kernel.org/linux-mm/20230804110454.2935878-1-shikemeng@huaweicloud.com/ https://lore.kernel.org/linux-mm/20230805110711.2975149-1-shikemeng@huaweicloud.com/
Hi Matthew, thanks for informing me of this. You can find the full series on lore.kernel.org at [1] and [2]. I contacted owner-linux-mm@kvack.org and was told all patches were received and forwarded successfully. Then I contacted meta@public-inbox.org, which is the only address I found on the help page of lore.kernel.org/linux-mm/, but there has been no response yet. Please let me know if there is any other way I can get help. Thanks! [1] https://lore.kernel.org/all/20230804110454.2935878-1-shikemeng@huaweicloud.com/ [2] https://lore.kernel.org/all/20230805110711.2975149-1-shikemeng@huaweicloud.com/ on 8/5/2023 11:14 AM, Matthew Wilcox wrote: > On Sat, Aug 05, 2023 at 07:07:02PM +0800, Kemeng Shi wrote: >> Hi all, this is another series to do fix and clean up to compaction. >> Patch 1-2 fix and clean up freepage list operation. >> Patch 3-4 fix and clean up isolation of freepages >> Patch 7-9 factor code to check if compaction is needed for allocation >> order. >> More details can be found in respective patches. Thanks! > > As with your last patch series, half of the patches are missing. > Looks like they didn't make it to lore.kernel.org either: > > https://lore.kernel.org/linux-mm/20230804110454.2935878-1-shikemeng@huaweicloud.com/ > https://lore.kernel.org/linux-mm/20230805110711.2975149-1-shikemeng@huaweicloud.com/ > >
On Sat, 5 Aug 2023 19:07:03 +0800 Kemeng Shi <shikemeng@huaweicloud.com> wrote: > The freepage is chained with buddy_list in freelist head. Use buddy_list > instead of lru to correct the list operation. > > ... > > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -1395,8 +1395,8 @@ move_freelist_head(struct list_head *freelist, struct page *freepage) > { > LIST_HEAD(sublist); > > - if (!list_is_last(freelist, &freepage->lru)) { > - list_cut_before(&sublist, freelist, &freepage->lru); > + if (!list_is_last(freelist, &freepage->buddy_list)) { > + list_cut_before(&sublist, freelist, &freepage->buddy_list); > list_splice_tail(&sublist, freelist); > } > } > @@ -1412,8 +1412,8 @@ move_freelist_tail(struct list_head *freelist, struct page *freepage) > { > LIST_HEAD(sublist); > > - if (!list_is_first(freelist, &freepage->lru)) { > - list_cut_position(&sublist, freelist, &freepage->lru); > + if (!list_is_first(freelist, &freepage->buddy_list)) { > + list_cut_position(&sublist, freelist, &freepage->buddy_list); > list_splice_tail(&sublist, freelist); > } > } This looks like a significant error. Can we speculate about the possible runtime effects?
on 8/6/2023 1:11 AM, Andrew Morton wrote: > On Sat, 5 Aug 2023 19:07:03 +0800 Kemeng Shi <shikemeng@huaweicloud.com> wrote: > >> The freepage is chained with buddy_list in freelist head. Use buddy_list >> instead of lru to correct the list operation. >> >> ... >> >> --- a/mm/compaction.c >> +++ b/mm/compaction.c >> @@ -1395,8 +1395,8 @@ move_freelist_head(struct list_head *freelist, struct page *freepage) >> { >> LIST_HEAD(sublist); >> >> - if (!list_is_last(freelist, &freepage->lru)) { >> - list_cut_before(&sublist, freelist, &freepage->lru); >> + if (!list_is_last(freelist, &freepage->buddy_list)) { >> + list_cut_before(&sublist, freelist, &freepage->buddy_list); >> list_splice_tail(&sublist, freelist); >> } >> } >> @@ -1412,8 +1412,8 @@ move_freelist_tail(struct list_head *freelist, struct page *freepage) >> { >> LIST_HEAD(sublist); >> >> - if (!list_is_first(freelist, &freepage->lru)) { >> - list_cut_position(&sublist, freelist, &freepage->lru); >> + if (!list_is_first(freelist, &freepage->buddy_list)) { >> + list_cut_position(&sublist, freelist, &freepage->buddy_list); >> list_splice_tail(&sublist, freelist); >> } >> } > > This looks like a significant error. Can we speculate about the > possible runtime effects? > > > It seems there are no runtime effects for now, as lru and buddy_list share the same memory address in a union.
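Why the two names alias: lru and buddy_list are members of the same union inside struct page, so they occupy the same two pointers. A heavily abridged sketch of that layout, for illustration only (the real definition in include/linux/mm_types.h has many more overlapping members and strict layout constraints):

struct list_head {
	struct list_head *next, *prev;
};

struct page {
	unsigned long flags;
	union {
		struct list_head lru;		/* linkage while on an LRU list */
		struct list_head buddy_list;	/* linkage while on a buddy freelist */
		struct list_head pcp_list;	/* linkage while on a per-cpu page list */
		/* ... other overlapping users elided ... */
	};
	/* ... */
};

So the old code walked the right pointers by accident; the fix makes the code use the semantically correct name, and keeps it correct should the union layout ever change.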
On 8/5/2023 7:07 PM, Kemeng Shi wrote: > The freepage is chained with buddy_list in freelist head. Use buddy_list > instead of lru to correct the list operation. > > Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> > --- > mm/compaction.c | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/mm/compaction.c b/mm/compaction.c > index ea61922a1619..513b1caeb4fa 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -1395,8 +1395,8 @@ move_freelist_head(struct list_head *freelist, struct page *freepage) > { > LIST_HEAD(sublist); > > - if (!list_is_last(freelist, &freepage->lru)) { > - list_cut_before(&sublist, freelist, &freepage->lru); > + if (!list_is_last(freelist, &freepage->buddy_list)) { > + list_cut_before(&sublist, freelist, &freepage->buddy_list); > list_splice_tail(&sublist, freelist); > } > } > @@ -1412,8 +1412,8 @@ move_freelist_tail(struct list_head *freelist, struct page *freepage) > { > LIST_HEAD(sublist); > > - if (!list_is_first(freelist, &freepage->lru)) { > - list_cut_position(&sublist, freelist, &freepage->lru); > + if (!list_is_first(freelist, &freepage->buddy_list)) { > + list_cut_position(&sublist, freelist, &freepage->buddy_list); > list_splice_tail(&sublist, freelist); > } > }
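For reviewers less familiar with these list.h helpers, here is a rough sketch of what move_freelist_head() does; the list contents A-D and F are hypothetical:

/*
 * freelist: A -> B -> F -> C -> D                (F == freepage)
 *
 * list_cut_before(&sublist, freelist, &F->buddy_list);
 *         freelist: F -> C -> D                  sublist: A -> B
 * list_splice_tail(&sublist, freelist);
 *         freelist: F -> C -> D -> A -> B
 *
 * Net effect: the list is rotated so that freepage becomes the head.
 * move_freelist_tail() is the mirror image: list_cut_position() cuts
 * through freepage inclusively, rotating it to the tail.
 */

The patch changes only which list_head member is used for this traversal; the rotation logic itself is untouched.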
On 8/5/2023 7:07 PM, Kemeng Shi wrote: > In strict mode, we should return 0 if there is any hole in pageblock. If > we successfully isolated pages at beginning at pageblock and then have a > bogus compound_order outside pageblock in next page. We will abort search > loop with blockpfn > end_pfn. Although we will limit blockpfn to end_pfn, > we will treat it as a successful isolation in strict mode as blockpfn is > not < end_pfn and return partial isolated pages. Then > isolate_freepages_range may success unexpectly with hole in isolated > range. Yes, that can happen. > This patch also removes unnecessary limit for blockpfn to go outside > by buddy page introduced in fixed commit or by stride introduced after > fixed commit. Caller could use returned blockpfn to check if full > pageblock is scanned by test if blockpfn >= end and to get next pfn to > scan inside isolate_freepages_block on demand. IMO, I don't think removing the pageblock restriction is worth it, since it did not fix anything and will make people more confused, at least to me. That is to say, it would be surprising that blockpfn can go outside of the pageblock after calling isolate_freepages_block() to scan only one pageblock, and I did not check in detail whether this can cause other potential problems. > Fixes: 9fcd6d2e052ee ("mm, compaction: skip compound pages by order in free scanner") > Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> > --- > mm/compaction.c | 21 +++++++++++---------- > 1 file changed, 11 insertions(+), 10 deletions(-) > > diff --git a/mm/compaction.c b/mm/compaction.c > index fa1b100b0d10..684f6e6cd8bc 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -631,6 +631,14 @@ static unsigned long isolate_freepages_block(struct compact_control *cc, > page += (1UL << order) - 1; > nr_scanned += (1UL << order) - 1; > } > + /* > + * There is a tiny chance that we have read bogus > + * compound_order(), so be careful to not go outside > + * of the pageblock. > + */ > + if (unlikely(blockpfn >= end_pfn)) > + blockpfn = end_pfn - 1; So we can just add this validation to ensure that isolate_freepages_block() can return 0 if a failure happens, which can fix your problem. > + > goto isolate_fail; > } > > @@ -677,17 +685,10 @@ static unsigned long isolate_freepages_block(struct compact_control *cc, > if (locked) > spin_unlock_irqrestore(&cc->zone->lock, flags); > > - /* > - * There is a tiny chance that we have read bogus compound_order(), > - * so be careful to not go outside of the pageblock. 
> - */ > - if (unlikely(blockpfn > end_pfn)) > - blockpfn = end_pfn; > - > trace_mm_compaction_isolate_freepages(*start_pfn, blockpfn, > nr_scanned, total_isolated); > > - /* Record how far we have got within the block */ > + /* Record how far we have got */ > *start_pfn = blockpfn; > > /* > @@ -1443,7 +1444,7 @@ fast_isolate_around(struct compact_control *cc, unsigned long pfn) > isolate_freepages_block(cc, &start_pfn, end_pfn, &cc->freepages, 1, false); > > /* Skip this pageblock in the future as it's full or nearly full */ > - if (start_pfn == end_pfn && !cc->no_set_skip_hint) > + if (start_pfn >= end_pfn && !cc->no_set_skip_hint) > set_pageblock_skip(page); > } > > @@ -1712,7 +1713,7 @@ static void isolate_freepages(struct compact_control *cc) > block_end_pfn, freelist, stride, false); > > /* Update the skip hint if the full pageblock was scanned */ > - if (isolate_start_pfn == block_end_pfn) > + if (isolate_start_pfn >= block_end_pfn) > update_pageblock_skip(cc, page, block_start_pfn - > pageblock_nr_pages); >
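For context, the strict-mode failure check near the end of isolate_freepages_block() reads roughly as follows (paraphrased from mm/compaction.c of this period):

	/*
	 * If strict isolation is requested by CMA then check that all the
	 * pages requested were isolated. If there were any failures, 0 is
	 * returned and CMA will fail.
	 */
	if (strict && blockpfn < end_pfn)
		total_isolated = 0;

Clamping blockpfn to end_pfn - 1 on the bogus compound_order() path, as suggested above, keeps blockpfn below end_pfn, so strict callers such as isolate_freepages_range() see 0 instead of a partial count.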
On 8/5/2023 7:07 PM, Kemeng Shi wrote: > We call isolate_freepages_block in strict mode, continuous pages in > pageblock will be isolated if isolate_freepages_block successed. > Then pfn + isolated will point to start of next pageblock to scan > no matter how many pageblocks are isolated in isolate_freepages_block. > Use pfn + isolated as start of next pageblock to scan to simplify the > iteration. IIUC, the isolate_freepages_block() can isolate high-order free pages, which means the pfn + isolated can be larger than the block_end_pfn. So in your patch, the 'block_start_pfn' and 'block_end_pfn' can be in different pageblocks, which will break pageblock_pfn_to_page(). > > Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> > --- > mm/compaction.c | 14 ++------------ > 1 file changed, 2 insertions(+), 12 deletions(-) > > diff --git a/mm/compaction.c b/mm/compaction.c > index 684f6e6cd8bc..8d7d38073d30 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -733,21 +733,11 @@ isolate_freepages_range(struct compact_control *cc, > block_end_pfn = pageblock_end_pfn(pfn); > > for (; pfn < end_pfn; pfn += isolated, > - block_start_pfn = block_end_pfn, > - block_end_pfn += pageblock_nr_pages) { > + block_start_pfn = pfn, > + block_end_pfn = pfn + pageblock_nr_pages) { > /* Protect pfn from changing by isolate_freepages_block */ > unsigned long isolate_start_pfn = pfn; > > - /* > - * pfn could pass the block_end_pfn if isolated freepage > - * is more than pageblock order. In this case, we adjust > - * scanning range to right one. > - */ > - if (pfn >= block_end_pfn) { > - block_start_pfn = pageblock_start_pfn(pfn); > - block_end_pfn = pageblock_end_pfn(pfn); > - } > - > block_end_pfn = min(block_end_pfn, end_pfn); > > if (!pageblock_pfn_to_page(block_start_pfn,
on 8/15/2023 4:28 PM, Baolin Wang wrote: > > > On 8/5/2023 7:07 PM, Kemeng Shi wrote: >> In strict mode, we should return 0 if there is any hole in pageblock. If >> we successfully isolated pages at beginning at pageblock and then have a >> bogus compound_order outside pageblock in next page. We will abort search >> loop with blockpfn > end_pfn. Although we will limit blockpfn to end_pfn, >> we will treat it as a successful isolation in strict mode as blockpfn is >> not < end_pfn and return partial isolated pages. Then >> isolate_freepages_range may success unexpectly with hole in isolated >> range. > > Yes, that can be happened. > >> This patch also removes unnecessary limit for blockpfn to go outside >> by buddy page introduced in fixed commit or by stride introduced after >> fixed commit. Caller could use returned blockpfn to check if full >> pageblock is scanned by test if blockpfn >= end and to get next pfn to >> scan inside isolate_freepages_block on demand. > > IMO, I don't think removing the pageblock restriction is worth it, since it did not fix anything and will make people more confused, at least to me. > > That is to say, it will be surprised that the blockpfn can go outside of the pageblock after calling isolate_freepages_block() to just scan only one pageblock, and I did not see in detail if this can cause other potential problems. > >> Fixes: 9fcd6d2e052ee ("mm, compaction: skip compound pages by order in free scanner") >> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> >> --- >> mm/compaction.c | 21 +++++++++++---------- >> 1 file changed, 11 insertions(+), 10 deletions(-) >> >> diff --git a/mm/compaction.c b/mm/compaction.c >> index fa1b100b0d10..684f6e6cd8bc 100644 >> --- a/mm/compaction.c >> +++ b/mm/compaction.c >> @@ -631,6 +631,14 @@ static unsigned long isolate_freepages_block(struct compact_control *cc, >> page += (1UL << order) - 1; >> nr_scanned += (1UL << order) - 1; >> } >> + /* >> + * There is a tiny chance that we have read bogus >> + * compound_order(), so be careful to not go outside >> + * of the pageblock. >> + */ >> + if (unlikely(blockpfn >= end_pfn)) >> + blockpfn = end_pfn - 1; > > So we can just add this validation to ensure that the isolate_freepages_block() can return 0 if failure is happened, which can fix your problem. > Thanks for feedback! Sure, I will do this in next version. >> + >> goto isolate_fail; >> } >> @@ -677,17 +685,10 @@ static unsigned long isolate_freepages_block(struct compact_control *cc, >> if (locked) >> spin_unlock_irqrestore(&cc->zone->lock, flags); >> - /* >> - * There is a tiny chance that we have read bogus compound_order(), >> - * so be careful to not go outside of the pageblock. 
>> - */ >> - if (unlikely(blockpfn > end_pfn)) >> - blockpfn = end_pfn; >> - >> trace_mm_compaction_isolate_freepages(*start_pfn, blockpfn, >> nr_scanned, total_isolated); >> - /* Record how far we have got within the block */ >> + /* Record how far we have got */ >> *start_pfn = blockpfn; >> /* >> @@ -1443,7 +1444,7 @@ fast_isolate_around(struct compact_control *cc, unsigned long pfn) >> isolate_freepages_block(cc, &start_pfn, end_pfn, &cc->freepages, 1, false); >> /* Skip this pageblock in the future as it's full or nearly full */ >> - if (start_pfn == end_pfn && !cc->no_set_skip_hint) >> + if (start_pfn >= end_pfn && !cc->no_set_skip_hint) >> set_pageblock_skip(page); >> } >> @@ -1712,7 +1713,7 @@ static void isolate_freepages(struct compact_control *cc) >> block_end_pfn, freelist, stride, false); >> /* Update the skip hint if the full pageblock was scanned */ >> - if (isolate_start_pfn == block_end_pfn) >> + if (isolate_start_pfn >= block_end_pfn) >> update_pageblock_skip(cc, page, block_start_pfn - >> pageblock_nr_pages); >> >
on 8/15/2023 4:38 PM, Baolin Wang wrote: > > > On 8/5/2023 7:07 PM, Kemeng Shi wrote: >> We call isolate_freepages_block in strict mode, continuous pages in >> pageblock will be isolated if isolate_freepages_block successed. >> Then pfn + isolated will point to start of next pageblock to scan >> no matter how many pageblocks are isolated in isolate_freepages_block. >> Use pfn + isolated as start of next pageblock to scan to simplify the >> iteration. > > IIUC, the isolate_freepages_block() can isolate high-order free pages, which means the pfn + isolated can be larger than the block_end_pfn. So in your patch, the 'block_start_pfn' and 'block_end_pfn' can be in different pageblocks, that will break pageblock_pfn_to_page(). > In the for-loop update statement, we always update block_start_pfn to pfn and block_end_pfn to pfn + pageblock_nr_pages, so they should point to the same pageblock. I guess you missed the change to the update of block_end_pfn. :) >> >> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> >> --- >> mm/compaction.c | 14 ++------------ >> 1 file changed, 2 insertions(+), 12 deletions(-) >> >> diff --git a/mm/compaction.c b/mm/compaction.c >> index 684f6e6cd8bc..8d7d38073d30 100644 >> --- a/mm/compaction.c >> +++ b/mm/compaction.c >> @@ -733,21 +733,11 @@ isolate_freepages_range(struct compact_control *cc, >> block_end_pfn = pageblock_end_pfn(pfn); >> for (; pfn < end_pfn; pfn += isolated, >> - block_start_pfn = block_end_pfn, >> - block_end_pfn += pageblock_nr_pages) { >> + block_start_pfn = pfn, >> + block_end_pfn = pfn + pageblock_nr_pages) { >> /* Protect pfn from changing by isolate_freepages_block */ >> unsigned long isolate_start_pfn = pfn; >> - /* >> - * pfn could pass the block_end_pfn if isolated freepage >> - * is more than pageblock order. In this case, we adjust >> - * scanning range to right one. >> - */ >> - if (pfn >= block_end_pfn) { >> - block_start_pfn = pageblock_start_pfn(pfn); >> - block_end_pfn = pageblock_end_pfn(pfn); >> - } >> - >> block_end_pfn = min(block_end_pfn, end_pfn); >> if (!pageblock_pfn_to_page(block_start_pfn, >
On 8/15/2023 5:32 PM, Kemeng Shi wrote: > > > on 8/15/2023 4:38 PM, Baolin Wang wrote: >> >> >> On 8/5/2023 7:07 PM, Kemeng Shi wrote: >>> We call isolate_freepages_block in strict mode, continuous pages in >>> pageblock will be isolated if isolate_freepages_block successed. >>> Then pfn + isolated will point to start of next pageblock to scan >>> no matter how many pageblocks are isolated in isolate_freepages_block. >>> Use pfn + isolated as start of next pageblock to scan to simplify the >>> iteration. >> >> IIUC, the isolate_freepages_block() can isolate high-order free pages, which means the pfn + isolated can be larger than the block_end_pfn. So in your patch, the 'block_start_pfn' and 'block_end_pfn' can be in different pageblocks, that will break pageblock_pfn_to_page(). >> > In for update statement, we always update block_start_pfn to pfn and I mean, you changed to: 1) pfn += isolated; 2) block_start_pfn = pfn; 3) block_end_pfn = pfn + pageblock_nr_pages; But in 1) pfn + isolated can go outside of the current pageblock if isolating a high-order page, for example, one located in the middle of the next pageblock. So block_start_pfn can point to the middle of the next pageblock, not the start position. Meanwhile after 3), block_end_pfn can point to yet another pageblock. Or did I miss something? > update block_end_pfn to pfn + pageblock_nr_pages. So they should point > to the same pageblock. I guess you missed the change to update of > block_end_pfn. :) >>> >>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> >>> --- >>> mm/compaction.c | 14 ++------------ >>> 1 file changed, 2 insertions(+), 12 deletions(-) >>> >>> diff --git a/mm/compaction.c b/mm/compaction.c >>> index 684f6e6cd8bc..8d7d38073d30 100644 >>> --- a/mm/compaction.c >>> +++ b/mm/compaction.c >>> @@ -733,21 +733,11 @@ isolate_freepages_range(struct compact_control *cc, >>> block_end_pfn = pageblock_end_pfn(pfn); >>> for (; pfn < end_pfn; pfn += isolated, >>> - block_start_pfn = block_end_pfn, >>> - block_end_pfn += pageblock_nr_pages) { >>> + block_start_pfn = pfn, >>> + block_end_pfn = pfn + pageblock_nr_pages) { >>> /* Protect pfn from changing by isolate_freepages_block */ >>> unsigned long isolate_start_pfn = pfn; >>> - /* >>> - * pfn could pass the block_end_pfn if isolated freepage >>> - * is more than pageblock order. In this case, we adjust >>> - * scanning range to right one. >>> - */ >>> - if (pfn >= block_end_pfn) { >>> - block_start_pfn = pageblock_start_pfn(pfn); >>> - block_end_pfn = pageblock_end_pfn(pfn); >>> - } >>> - >>> block_end_pfn = min(block_end_pfn, end_pfn); >>> if (!pageblock_pfn_to_page(block_start_pfn, >>
on 8/15/2023 6:07 PM, Baolin Wang wrote: > > > On 8/15/2023 5:32 PM, Kemeng Shi wrote: >> >> >> on 8/15/2023 4:38 PM, Baolin Wang wrote: >>> >>> >>> On 8/5/2023 7:07 PM, Kemeng Shi wrote: >>>> We call isolate_freepages_block in strict mode, continuous pages in >>>> pageblock will be isolated if isolate_freepages_block successed. >>>> Then pfn + isolated will point to start of next pageblock to scan >>>> no matter how many pageblocks are isolated in isolate_freepages_block. >>>> Use pfn + isolated as start of next pageblock to scan to simplify the >>>> iteration. >>> >>> IIUC, the isolate_freepages_block() can isolate high-order free pages, which means the pfn + isolated can be larger than the block_end_pfn. So in your patch, the 'block_start_pfn' and 'block_end_pfn' can be in different pageblocks, that will break pageblock_pfn_to_page(). >>> >> In for update statement, we always update block_start_pfn to pfn and > > I mean, you changed to: > 1) pfn += isolated; > 2) block_start_pfn = pfn; > 3) block_end_pfn = pfn + pageblock_nr_pages; > > But in 1) pfn + isolated can go outside of the currnet pageblock if isolating a high-order page, for example, located in the middle of the next pageblock. So that the block_start_pfn can point to the middle of the next pageblock, not the start position. Meanwhile after 3), the block_end_pfn can point another pageblock. Or I missed something else? > Ah, I missed explaining this in the changelog. Consider the case where we have a buddy page with order higher than the pageblock order: 1. a buddy page is aligned to its order 2. the order of the page is higher than the pageblock order Then the page is aligned to the pageblock order, so the pfn of the page and the isolated page count are both pageblock-order aligned. So pfn + isolated is pageblock-order aligned. >> update block_end_pfn to pfn + pageblock_nr_pages. So they should point >> to the same pageblock. I guess you missed the change to update of >> block_end_pfn. :) >>>> >>>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> >>>> --- >>>> mm/compaction.c | 14 ++------------ >>>> 1 file changed, 2 insertions(+), 12 deletions(-) >>>> >>>> diff --git a/mm/compaction.c b/mm/compaction.c >>>> index 684f6e6cd8bc..8d7d38073d30 100644 >>>> --- a/mm/compaction.c >>>> +++ b/mm/compaction.c >>>> @@ -733,21 +733,11 @@ isolate_freepages_range(struct compact_control *cc, >>>> block_end_pfn = pageblock_end_pfn(pfn); >>>> for (; pfn < end_pfn; pfn += isolated, >>>> - block_start_pfn = block_end_pfn, >>>> - block_end_pfn += pageblock_nr_pages) { >>>> + block_start_pfn = pfn, >>>> + block_end_pfn = pfn + pageblock_nr_pages) { >>>> /* Protect pfn from changing by isolate_freepages_block */ >>>> unsigned long isolate_start_pfn = pfn; >>>> - /* >>>> - * pfn could pass the block_end_pfn if isolated freepage >>>> - * is more than pageblock order. In this case, we adjust >>>> - * scanning range to right one. >>>> - */ >>>> - if (pfn >= block_end_pfn) { >>>> - block_start_pfn = pageblock_start_pfn(pfn); >>>> - block_end_pfn = pageblock_end_pfn(pfn); >>>> - } >>>> - >>>> block_end_pfn = min(block_end_pfn, end_pfn); >>>> if (!pageblock_pfn_to_page(block_start_pfn, >>> >
On 8/15/2023 6:37 PM, Kemeng Shi wrote: > > > on 8/15/2023 6:07 PM, Baolin Wang wrote: >> >> >> On 8/15/2023 5:32 PM, Kemeng Shi wrote: >>> >>> >>> on 8/15/2023 4:38 PM, Baolin Wang wrote: >>>> >>>> >>>> On 8/5/2023 7:07 PM, Kemeng Shi wrote: >>>>> We call isolate_freepages_block in strict mode, continuous pages in >>>>> pageblock will be isolated if isolate_freepages_block successed. >>>>> Then pfn + isolated will point to start of next pageblock to scan >>>>> no matter how many pageblocks are isolated in isolate_freepages_block. >>>>> Use pfn + isolated as start of next pageblock to scan to simplify the >>>>> iteration. >>>> >>>> IIUC, the isolate_freepages_block() can isolate high-order free pages, which means the pfn + isolated can be larger than the block_end_pfn. So in your patch, the 'block_start_pfn' and 'block_end_pfn' can be in different pageblocks, that will break pageblock_pfn_to_page(). >>>> >>> In for update statement, we always update block_start_pfn to pfn and >> >> I mean, you changed to: >> 1) pfn += isolated; >> 2) block_start_pfn = pfn; >> 3) block_end_pfn = pfn + pageblock_nr_pages; >> >> But in 1) pfn + isolated can go outside of the currnet pageblock if isolating a high-order page, for example, located in the middle of the next pageblock. So that the block_start_pfn can point to the middle of the next pageblock, not the start position. Meanwhile after 3), the block_end_pfn can point another pageblock. Or I missed something else? >> > Ah, I miss to explain this in changelog. > In case we could we have buddy page with order higher than pageblock: > 1. page in buddy page is aligned with it's order > 2. order of page is higher than pageblock order > Then page is aligned with pageblock order. So pfn of page and isolated pages > count are both aligned pageblock order. So pfn + isolated is pageblock order > aligned. That's not what I mean. pfn + isolated is not always pageblock-aligned, since isolate_freepages_block() can isolate high-order free pages (for example: order-1, order-2 ...). Suppose the pageblock size is 2M and we isolate a pageblock (suppose the pfn range is 0 - 511 to make the arithmetic easy) by isolate_freepages_block(), and suppose pfn 0 to pfn 510 are all order-0 pages, but pfn 511 is an order-1 page, so you will isolate 513 pages from this pageblock, which will make 'pfn + isolated' not pageblock aligned. >>> update block_end_pfn to pfn + pageblock_nr_pages. So they should point >>> to the same pageblock. I guess you missed the change to update of >>> block_end_pfn. :) >>>>> >>>>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> >>>>> --- >>>>> mm/compaction.c | 14 ++------------ >>>>> 1 file changed, 2 insertions(+), 12 deletions(-) >>>>> >>>>> diff --git a/mm/compaction.c b/mm/compaction.c >>>>> index 684f6e6cd8bc..8d7d38073d30 100644 >>>>> --- a/mm/compaction.c >>>>> +++ b/mm/compaction.c >>>>> @@ -733,21 +733,11 @@ isolate_freepages_range(struct compact_control *cc, >>>>> block_end_pfn = pageblock_end_pfn(pfn); >>>>> for (; pfn < end_pfn; pfn += isolated, >>>>> - block_start_pfn = block_end_pfn, >>>>> - block_end_pfn += pageblock_nr_pages) { >>>>> + block_start_pfn = pfn, >>>>> + block_end_pfn = pfn + pageblock_nr_pages) { >>>>> /* Protect pfn from changing by isolate_freepages_block */ >>>>> unsigned long isolate_start_pfn = pfn; >>>>> - /* >>>>> - * pfn could pass the block_end_pfn if isolated freepage >>>>> - * is more than pageblock order. In this case, we adjust >>>>> - * scanning range to right one. 
>>>>> - */ >>>>> - if (pfn >= block_end_pfn) { >>>>> - block_start_pfn = pageblock_start_pfn(pfn); >>>>> - block_end_pfn = pageblock_end_pfn(pfn); >>>>> - } >>>>> - >>>>> block_end_pfn = min(block_end_pfn, end_pfn); >>>>> if (!pageblock_pfn_to_page(block_start_pfn, >>>> >>
on 8/19/2023 7:58 PM, Baolin Wang wrote: > > > On 8/15/2023 6:37 PM, Kemeng Shi wrote: >> >> >> on 8/15/2023 6:07 PM, Baolin Wang wrote: >>> >>> >>> On 8/15/2023 5:32 PM, Kemeng Shi wrote: >>>> >>>> >>>> on 8/15/2023 4:38 PM, Baolin Wang wrote: >>>>> >>>>> >>>>> On 8/5/2023 7:07 PM, Kemeng Shi wrote: >>>>>> We call isolate_freepages_block in strict mode, continuous pages in >>>>>> pageblock will be isolated if isolate_freepages_block successed. >>>>>> Then pfn + isolated will point to start of next pageblock to scan >>>>>> no matter how many pageblocks are isolated in isolate_freepages_block. >>>>>> Use pfn + isolated as start of next pageblock to scan to simplify the >>>>>> iteration. >>>>> >>>>> IIUC, the isolate_freepages_block() can isolate high-order free pages, which means the pfn + isolated can be larger than the block_end_pfn. So in your patch, the 'block_start_pfn' and 'block_end_pfn' can be in different pageblocks, that will break pageblock_pfn_to_page(). >>>>> >>>> In for update statement, we always update block_start_pfn to pfn and >>> >>> I mean, you changed to: >>> 1) pfn += isolated; >>> 2) block_start_pfn = pfn; >>> 3) block_end_pfn = pfn + pageblock_nr_pages; >>> >>> But in 1) pfn + isolated can go outside of the currnet pageblock if isolating a high-order page, for example, located in the middle of the next pageblock. So that the block_start_pfn can point to the middle of the next pageblock, not the start position. Meanwhile after 3), the block_end_pfn can point another pageblock. Or I missed something else? >>> >> Ah, I miss to explain this in changelog. >> In case we could we have buddy page with order higher than pageblock: >> 1. page in buddy page is aligned with it's order >> 2. order of page is higher than pageblock order >> Then page is aligned with pageblock order. So pfn of page and isolated pages >> count are both aligned pageblock order. So pfn + isolated is pageblock order >> aligned. > > That's not what I mean. pfn + isolated is not always pageblock-aligned, since the isolate_freepages_block() can isolated high-order free pages (for example: order-1, order-2 ...). > > Suppose the pageblock size is 2M, when isolating a pageblock (suppose the pfn range is 0 - 511 to make the arithmetic easy) by isolate_freepages_block(), and suppose pfn 0 to pfn 510 are all order-0 page, but pfn 511 is order-1 page, so you will isolate 513 pages from this pageblock, which will make 'pfn + isolated' not pageblock aligned. This is also not supposed to happen, as low-order buddy pages never cross the boundary of high-order pages: in the buddy system, we always split an order-N page into two order-(N - 1) pages as follows: | order N | |order N - 1|order N - 1| So buddy pages of order N - 1 will never cross a boundary of order N. Similarly, buddy pages of order N - 2 will never cross a boundary of order N - 1, and so on. Thus any page with order less than N will never cross a boundary of order N. > >>>> update block_end_pfn to pfn + pageblock_nr_pages. So they should point >>>> to the same pageblock. I guess you missed the change to update of >>>> block_end_pfn. 
:) >>>>>> >>>>>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> >>>>>> --- >>>>>> mm/compaction.c | 14 ++------------ >>>>>> 1 file changed, 2 insertions(+), 12 deletions(-) >>>>>> >>>>>> diff --git a/mm/compaction.c b/mm/compaction.c >>>>>> index 684f6e6cd8bc..8d7d38073d30 100644 >>>>>> --- a/mm/compaction.c >>>>>> +++ b/mm/compaction.c >>>>>> @@ -733,21 +733,11 @@ isolate_freepages_range(struct compact_control *cc, >>>>>> block_end_pfn = pageblock_end_pfn(pfn); >>>>>> for (; pfn < end_pfn; pfn += isolated, >>>>>> - block_start_pfn = block_end_pfn, >>>>>> - block_end_pfn += pageblock_nr_pages) { >>>>>> + block_start_pfn = pfn, >>>>>> + block_end_pfn = pfn + pageblock_nr_pages) { >>>>>> /* Protect pfn from changing by isolate_freepages_block */ >>>>>> unsigned long isolate_start_pfn = pfn; >>>>>> - /* >>>>>> - * pfn could pass the block_end_pfn if isolated freepage >>>>>> - * is more than pageblock order. In this case, we adjust >>>>>> - * scanning range to right one. >>>>>> - */ >>>>>> - if (pfn >= block_end_pfn) { >>>>>> - block_start_pfn = pageblock_start_pfn(pfn); >>>>>> - block_end_pfn = pageblock_end_pfn(pfn); >>>>>> - } >>>>>> - >>>>>> block_end_pfn = min(block_end_pfn, end_pfn); >>>>>> if (!pageblock_pfn_to_page(block_start_pfn, >>>>> >>> > >
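A minimal standalone check of the splitting invariant described above (the maximal order of 12 and the aligned start pfn are assumptions chosen for illustration; this is not kernel code):

#include <assert.h>

/*
 * Recursively split an order-N block the way the buddy allocator does
 * and verify that every resulting sub-block of order k starts at a pfn
 * aligned to 2^k -- which implies it cannot cross any order-m boundary
 * for m >= k, pageblock boundaries included.
 */
static void check_split(unsigned long start, unsigned long order)
{
	unsigned long size = 1UL << order;

	assert(start % size == 0);	/* aligned to its own order */

	if (order == 0)
		return;
	check_split(start, order - 1);			/* lower buddy */
	check_split(start + size / 2, order - 1);	/* upper buddy */
}

int main(void)
{
	check_split(1UL << 12, 12);	/* an order-12 block at an aligned pfn */
	return 0;
}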
On 8/22/2023 9:37 AM, Kemeng Shi wrote: > > > on 8/19/2023 7:58 PM, Baolin Wang wrote: >> >> >> On 8/15/2023 6:37 PM, Kemeng Shi wrote: >>> >>> >>> on 8/15/2023 6:07 PM, Baolin Wang wrote: >>>> >>>> >>>> On 8/15/2023 5:32 PM, Kemeng Shi wrote: >>>>> >>>>> >>>>> on 8/15/2023 4:38 PM, Baolin Wang wrote: >>>>>> >>>>>> >>>>>> On 8/5/2023 7:07 PM, Kemeng Shi wrote: >>>>>>> We call isolate_freepages_block in strict mode, continuous pages in >>>>>>> pageblock will be isolated if isolate_freepages_block successed. >>>>>>> Then pfn + isolated will point to start of next pageblock to scan >>>>>>> no matter how many pageblocks are isolated in isolate_freepages_block. >>>>>>> Use pfn + isolated as start of next pageblock to scan to simplify the >>>>>>> iteration. >>>>>> >>>>>> IIUC, the isolate_freepages_block() can isolate high-order free pages, which means the pfn + isolated can be larger than the block_end_pfn. So in your patch, the 'block_start_pfn' and 'block_end_pfn' can be in different pageblocks, that will break pageblock_pfn_to_page(). >>>>>> >>>>> In for update statement, we always update block_start_pfn to pfn and >>>> >>>> I mean, you changed to: >>>> 1) pfn += isolated; >>>> 2) block_start_pfn = pfn; >>>> 3) block_end_pfn = pfn + pageblock_nr_pages; >>>> >>>> But in 1) pfn + isolated can go outside of the currnet pageblock if isolating a high-order page, for example, located in the middle of the next pageblock. So that the block_start_pfn can point to the middle of the next pageblock, not the start position. Meanwhile after 3), the block_end_pfn can point another pageblock. Or I missed something else? >>>> >>> Ah, I miss to explain this in changelog. >>> In case we could we have buddy page with order higher than pageblock: >>> 1. page in buddy page is aligned with it's order >>> 2. order of page is higher than pageblock order >>> Then page is aligned with pageblock order. So pfn of page and isolated pages >>> count are both aligned pageblock order. So pfn + isolated is pageblock order >>> aligned. >> >> That's not what I mean. pfn + isolated is not always pageblock-aligned, since the isolate_freepages_block() can isolated high-order free pages (for example: order-1, order-2 ...). >> >> Suppose the pageblock size is 2M, when isolating a pageblock (suppose the pfn range is 0 - 511 to make the arithmetic easy) by isolate_freepages_block(), and suppose pfn 0 to pfn 510 are all order-0 page, but pfn 511 is order-1 page, so you will isolate 513 pages from this pageblock, which will make 'pfn + isolated' not pageblock aligned. I realized I made a bad example, sorry for noise. After more thinking, I agree that the 'pfn + isolated' is always pageblock aligned in strict mode. So feel free to add: Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> > This is also no supposed to happen as low order buddy pages should never span > cross boundary of high order pages: > In buddy system, we always split order N pages into two order N - 1 pages as > following: > | order N | > |order N - 1|order N - 1| > So buddy pages with order N - 1 will never cross boudary of order N. Similar, > buddy pages with order N - 2 will never cross boudary of order N - 1 and so > on. Then any pages with order less than N will never cross boudary of order > N. > >> >>>>> update block_end_pfn to pfn + pageblock_nr_pages. So they should point >>>>> to the same pageblock. I guess you missed the change to update of >>>>> block_end_pfn. 
:) >>>>>>> >>>>>>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> >>>>>>> --- >>>>>>> mm/compaction.c | 14 ++------------ >>>>>>> 1 file changed, 2 insertions(+), 12 deletions(-) >>>>>>> >>>>>>> diff --git a/mm/compaction.c b/mm/compaction.c >>>>>>> index 684f6e6cd8bc..8d7d38073d30 100644 >>>>>>> --- a/mm/compaction.c >>>>>>> +++ b/mm/compaction.c >>>>>>> @@ -733,21 +733,11 @@ isolate_freepages_range(struct compact_control *cc, >>>>>>> block_end_pfn = pageblock_end_pfn(pfn); >>>>>>> for (; pfn < end_pfn; pfn += isolated, >>>>>>> - block_start_pfn = block_end_pfn, >>>>>>> - block_end_pfn += pageblock_nr_pages) { >>>>>>> + block_start_pfn = pfn, >>>>>>> + block_end_pfn = pfn + pageblock_nr_pages) { >>>>>>> /* Protect pfn from changing by isolate_freepages_block */ >>>>>>> unsigned long isolate_start_pfn = pfn; >>>>>>> - /* >>>>>>> - * pfn could pass the block_end_pfn if isolated freepage >>>>>>> - * is more than pageblock order. In this case, we adjust >>>>>>> - * scanning range to right one. >>>>>>> - */ >>>>>>> - if (pfn >= block_end_pfn) { >>>>>>> - block_start_pfn = pageblock_start_pfn(pfn); >>>>>>> - block_end_pfn = pageblock_end_pfn(pfn); >>>>>>> - } >>>>>>> - >>>>>>> block_end_pfn = min(block_end_pfn, end_pfn); >>>>>>> if (!pageblock_pfn_to_page(block_start_pfn, >>>>>> >>>> >> >>
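To close the loop on the alignment question settled above, here is a standalone sanity check of the argument (4K pages and order-9 pageblocks are assumptions chosen for illustration; this is not kernel code):

#include <assert.h>

int main(void)
{
	const unsigned long pageblock_order = 9;	/* assumed: 2M blocks, 4K pages */
	const unsigned long pageblock_nr_pages = 1UL << pageblock_order;

	/*
	 * A buddy page of order o starts at a pfn that is a multiple of
	 * 2^o. For o >= pageblock_order, both the start pfn and the page
	 * count (2^o) are then multiples of pageblock_nr_pages, so
	 * pfn + isolated stays pageblock-aligned in strict mode.
	 */
	for (unsigned long order = pageblock_order; order <= 11; order++) {
		unsigned long count = 1UL << order;

		for (unsigned long pfn = 0; pfn < (1UL << 14); pfn += count) {
			assert(pfn % pageblock_nr_pages == 0);
			assert((pfn + count) % pageblock_nr_pages == 0);
		}
	}
	return 0;
}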