Message ID | 20241102201621.95291-1-liuq131@chinatelecom.cn (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm/compaction: fix the total_isolated in strict mode | expand |
On Sat, 2 Nov 2024 20:16:21 +0000 Qiang Liu <liuq131@chinatelecom.cn> wrote: > If the last cycle reads bogus compound_order() and blockpfn > end_pfn occurs, > it is possible that total_isolated will be less than nr_scanned. In this case, > strict mode should return 0, but the “if (strict && blockpfn < end_pfn)” > statement cannot recognize this situation > > ... > > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -699,7 +699,7 @@ static unsigned long isolate_freepages_block(struct compact_control *cc, > * pages requested were isolated. If there were any failures, 0 is > * returned and CMA will fail. > */ > - if (strict && blockpfn < end_pfn) > + if (strict && (blockpfn < end_pfn || total_isolated != nr_scanned)) > total_isolated = 0; > > cc->total_free_scanned += nr_scanned; That's really old code. What userspace-visible effects might this have? Is this from code inspection, or was some misbehaviour observed? Thanks.
On 2024/11/3 04:16, Qiang Liu wrote: > If the last cycle reads bogus compound_order() and blockpfn > end_pfn occurs, if blockpfn > end_pfn occurs, we will reset the blockpfn, right? /* * Be careful to not go outside of the pageblock. */ if (unlikely(blockpfn > end_pfn)) blockpfn = end_pfn; So how this can happen? > it is possible that total_isolated will be less than nr_scanned. In this case, > strict mode should return 0, but the “if (strict && blockpfn < end_pfn)” > statement cannot recognize this situation > > Signed-off-by: Qiang Liu <liuq131@chinatelecom.cn> > --- > mm/compaction.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/compaction.c b/mm/compaction.c > index a2b16b08cbbf..6009f5d1021a 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -699,7 +699,7 @@ static unsigned long isolate_freepages_block(struct compact_control *cc, > * pages requested were isolated. If there were any failures, 0 is > * returned and CMA will fail. > */ > - if (strict && blockpfn < end_pfn) > + if (strict && (blockpfn < end_pfn || total_isolated != nr_scanned)) > total_isolated = 0; > > cc->total_free_scanned += nr_scanned;
On 2024/11/12 10:16, liuq131@chinatelecom.cn wrote: > "We assume that the block we are currently processing is distributed as follows: > 0 1 2 511 > -------------------------------------------------- > | | | | > --------------------------------------------------- > Index 0 and 1 are both pages with an order of 0. > Index 2 has a bogus order (let's assume the order is 9). > When the for loop reaches index 2, it will enter the following code: > /* > * For compound pages such as THP and hugetlbfs, we can save > * potentially a lot of iterations if we skip them at once. > * The check is racy, but we can consider only valid values > * and the only danger is skipping too much. > */ > if (PageCompound(page)) { > const unsigned int order = compound_order(page); > if (blockpfn + (1UL << order) <= end_pfn) { > blockpfn += (1UL << order) - 1; > page += (1UL << order) - 1; > nr_scanned += (1UL << order) - 1; > } > goto isolate_fail; > } > > After exiting the for loop: > blockpfn =basepfn+ 2+2^9 = basepfn+514 > endpfn = basepfn +512 > total_isolated = 2 > nr_scanned = 514 In your case, the 'blockpfn' will not be updated to 'basepfn+514', because 'blockpfn + (1UL << order) > end_pfn', right? And remember the 'end_pfn' is the end of the pageblock. So I'm still confused about your case. Is this from code inspection? > /* > * Be careful to not go outside of the pageblock. > */ > if (unlikely(blockpfn > end_pfn)) > blockpfn = end_pfn; > > So this can happen > > /* > * If strict isolation is requested by CMA then check that all the > * pages requested were isolated. If there were any failures, 0 is > * returned and CMA will fail. > */ > if (strict && blockpfn < end_pfn) > total_isolated = 0; > > If processed according to the old code, it will not enter the if statement to reset total_isolated, but the correct handling is to reset total_isolated to 0. Please do not top-posting: " - Use interleaved ("inline") replies, which makes your response easier to read. (i.e. avoid top-posting -- the practice of putting your answer above the quoted text you are responding to.) For more details, see :ref:`Documentation/process/submitting-patches.rst <interleaved_replies>`. "
On 2024/11/14 10:10, Qiang Liu wrote: > From: Baolin Wang <baolin.wang@linux.alibaba.com> > > > On 2024/11/12 17:47, baolin.wang@linux.alibaba.com wrote: >> On 2024/11/12 10:16, liuq131@chinatelecom.cn wrote: >>> "We assume that the block we are currently processing is distributed >>> as follows: >>> 0 1 2 511 >>> -------------------------------------------------- >>> | | >>> | | >>> --------------------------------------------------- >>> Index 0 and 1 are both pages with an order of 0. >>> Index 2 has a bogus order (let's assume the order is 9). >>> When the for loop reaches index 2, it will enter the following code: >>> /* >>> * For compound pages such as THP and hugetlbfs, we can save >>> * potentially a lot of iterations if we skip them at once. >>> * The check is racy, but we can consider only valid values >>> * and the only danger is skipping too much. >>> */ >>> if (PageCompound(page)) { >>> const unsigned int order = compound_order(page); >>> if (blockpfn + (1UL << order) <= end_pfn) { >>> blockpfn += (1UL << order) - 1; >>> page += (1UL << order) - 1; >>> nr_scanned += (1UL << order) - 1; >>> } >>> goto isolate_fail; >>> } >>> >>> After exiting the for loop: >>> blockpfn =basepfn+ 2+2^9 = basepfn+514 >>> endpfn = basepfn +512 >>> total_isolated = 2 >>> nr_scanned = 514 >> >> In your case, the 'blockpfn' will not be updated to 'basepfn+514', >> because 'blockpfn + (1UL << order) > end_pfn', right? And remember the >> 'end_pfn' is the end of the pageblock. >> >> So I'm still confused about your case. Is this from code inspection? > You're right, the situation where blockpfn > end_pfn would not actually > occur here. > I encountered this issue in the 4.19 kernel, which did not have this check. > I didn't carefully examine this scenario later. Sorry about that. Never mind:) > However, when blockpfn == end_pfn, I believe the patch is still applicable, > but the git log needs to be updated. Is there still an opportunity to > submit > a revised version of the patch? Of course yes, and please describe your issue clearly in the next verion. However, IIUC when blockpfn == end_pfn in your case, the 'total_isolated' is still 0. >>> /* >>> * Be careful to not go outside of the pageblock. >>> */ >>> if (unlikely(blockpfn > end_pfn)) >>> blockpfn = end_pfn; >>> So this can happen >>> >>> /* >>> * If strict isolation is requested by CMA then check that all the >>> * pages requested were isolated. If there were any failures, 0 is >>> * returned and CMA will fail. >>> */ >>> if (strict && blockpfn < end_pfn) >>> total_isolated = 0; >>> >>> If processed according to the old code, it will not enter the if >>> statement to reset total_isolated, but the correct handling is to >>> reset total_isolated to 0. >> >> Please do not top-posting: >> >> " >> - Use interleaved ("inline") replies, which makes your response easier >> to read. (i.e. avoid top-posting -- the practice of putting your >> answer above the quoted text you are responding to.) For more details, >> see >> :ref:`Documentation/process/submitting-patches.rst >> <interleaved_replies>`. >> " >>
diff --git a/mm/compaction.c b/mm/compaction.c index a2b16b08cbbf..6009f5d1021a 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -699,7 +699,7 @@ static unsigned long isolate_freepages_block(struct compact_control *cc, * pages requested were isolated. If there were any failures, 0 is * returned and CMA will fail. */ - if (strict && blockpfn < end_pfn) + if (strict && (blockpfn < end_pfn || total_isolated != nr_scanned)) total_isolated = 0; cc->total_free_scanned += nr_scanned;
If the last cycle reads bogus compound_order() and blockpfn > end_pfn occurs, it is possible that total_isolated will be less than nr_scanned. In this case, strict mode should return 0, but the “if (strict && blockpfn < end_pfn)” statement cannot recognize this situation Signed-off-by: Qiang Liu <liuq131@chinatelecom.cn> --- mm/compaction.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)