Message ID | 56EC0C41.70503@suse.cz (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Am Freitag, den 18.03.2016, 15:10 +0100 schrieb Vlastimil Babka: > On 03/17/2016 04:52 PM, Joonsoo Kim wrote: > > 2016-03-18 0:43 GMT+09:00 Vlastimil Babka <vbabka@suse.cz>: > >>>>>> > >>>>>> Okay. I used following slightly optimized version and I need to > >>>>>> add 'max_order = min_t(unsigned int, MAX_ORDER, pageblock_order + 1)' > >>>>>> to yours. Please consider it, too. > >>>>> > >>>>> Hmm, this one is not work, I still can see the bug is there after > >>>>> applying > >>>>> this patch, did I miss something? > >>>> > >>>> I may find that there is a bug which was introduced by me some time > >>>> ago. Could you test following change in __free_one_page() on top of > >>>> Vlastimil's patch? > >>>> > >>>> -page_idx = pfn & ((1 << max_order) - 1); > >>>> +page_idx = pfn & ((1 << MAX_ORDER) - 1); > >>> > >>> > >>> I tested Vlastimil's patch + your change with stress for more than half > >>> hour, the bug > >>> I reported is gone :) > >> > >> > >> Oh, ok, will try to send proper patch, once I figure out what to write in > >> the changelog :) > > > > Thanks in advance! > > > > OK, here it is. Hanjun can you please retest this, as I'm not sure if you had > the same code due to the followup one-liner patches in the thread. Lucas, see if > it helps with your issue as well. Laura and Joonsoo, please also test and review > and check changelog if my perception of the problem is accurate :) > This doesn't help for my case, as it is still trying to merge pages in isolated ranges. It even tries extra hard at doing so. With concurrent isolation and frees going on this may lead to the start page of the range to be isolated merging into an higher order buddy page if it isn't already pageblock aligned, leading both test_pages_isolated and isolate_freepages to fail on an otherwise perfectly fine range. What I am arguing is that if a page is freed into an isolated range we should not try merge it with it's buddies at all, by setting max_order = order. If the range is isolated because want to isolate freepages from it, the work to do the merging is wasted, as isolate_freepages will split higher order pages into order-0 pages again. If we already finished isolating freepages and are in the process of undoing the isolation, we don't strictly need to do the merging in __free_one_page, but can defer it to unset_migratetype_isolate, allowing to simplify those code paths by disallowing any merging of isolated pages at all. Regards, Lucas
On 03/18/2016 03:42 PM, Lucas Stach wrote: > Am Freitag, den 18.03.2016, 15:10 +0100 schrieb Vlastimil Babka: >> On 03/17/2016 04:52 PM, Joonsoo Kim wrote: >> > 2016-03-18 0:43 GMT+09:00 Vlastimil Babka <vbabka@suse.cz>: >> >> OK, here it is. Hanjun can you please retest this, as I'm not sure if you had >> the same code due to the followup one-liner patches in the thread. Lucas, see if >> it helps with your issue as well. Laura and Joonsoo, please also test and review >> and check changelog if my perception of the problem is accurate :) >> > > This doesn't help for my case, as it is still trying to merge pages in > isolated ranges. It even tries extra hard at doing so. > > With concurrent isolation and frees going on this may lead to the start > page of the range to be isolated merging into an higher order buddy page > if it isn't already pageblock aligned, leading both test_pages_isolated > and isolate_freepages to fail on an otherwise perfectly fine range. > > What I am arguing is that if a page is freed into an isolated range we > should not try merge it with it's buddies at all, by setting max_order = > order. If the range is isolated because want to isolate freepages from > it, the work to do the merging is wasted, as isolate_freepages will > split higher order pages into order-0 pages again. > > If we already finished isolating freepages and are in the process of > undoing the isolation, we don't strictly need to do the merging in > __free_one_page, but can defer it to unset_migratetype_isolate, allowing > to simplify those code paths by disallowing any merging of isolated > pages at all. Oh, I think understand now. Yeah, skipping merging for pages in isolated pageblocks might be a rather elegant solution. But still, we would have to check buddy's migratetype at order >= pageblock_order like my patch does, which is annoying. Because even without isolated merging, the buddy might have already had order>=pageblock_order when it was isolated. So what if isolation also split existing buddies in the pageblock immediately when it sets the MIGRATETYPE_ISOLATE on the pageblock? Then we would have it guaranteed that there's no isolated buddy - a buddy candidate at order >= pageblock_order either has a smaller order (so it's not a buddy) or is not MIGRATE_ISOLATE so it's safe to merge with. Does that make sense?
On 2016/3/18 22:10, Vlastimil Babka wrote: > On 03/17/2016 04:52 PM, Joonsoo Kim wrote: >> 2016-03-18 0:43 GMT+09:00 Vlastimil Babka <vbabka@suse.cz>: >>>>>>> Okay. I used following slightly optimized version and I need to >>>>>>> add 'max_order = min_t(unsigned int, MAX_ORDER, pageblock_order + 1)' >>>>>>> to yours. Please consider it, too. >>>>>> Hmm, this one is not work, I still can see the bug is there after >>>>>> applying >>>>>> this patch, did I miss something? >>>>> I may find that there is a bug which was introduced by me some time >>>>> ago. Could you test following change in __free_one_page() on top of >>>>> Vlastimil's patch? >>>>> >>>>> -page_idx = pfn & ((1 << max_order) - 1); >>>>> +page_idx = pfn & ((1 << MAX_ORDER) - 1); >>>> >>>> I tested Vlastimil's patch + your change with stress for more than half >>>> hour, the bug >>>> I reported is gone :) >>> >>> Oh, ok, will try to send proper patch, once I figure out what to write in >>> the changelog :) >> Thanks in advance! >> > OK, here it is. Hanjun can you please retest this, as I'm not sure if you had I tested this new patch with stress for more than one hour, and it works! Since Lucas has comments on it, I'm willing to test further versions if needed. One minor comments below, > the same code due to the followup one-liner patches in the thread. Lucas, see if > it helps with your issue as well. Laura and Joonsoo, please also test and review > and check changelog if my perception of the problem is accurate :) > > Thanks > [...] > + if (max_order < MAX_ORDER) { > + /* If we are here, it means order is >= pageblock_order. > + * We want to prevent merge between freepages on isolate > + * pageblock and normal pageblock. Without this, pageblock > + * isolation could cause incorrect freepage or CMA accounting. > + * > + * We don't want to hit this code for the more frequent > + * low-order merging. > + */ > + if (unlikely(has_isolate_pageblock(zone))) { In the first version of your patch, it's + if (IS_ENABLED(CONFIG_CMA) && + unlikely(has_isolate_pageblock(zone))) { Why remove the IS_ENABLED(CONFIG_CMA) in the new version? Thanks Hanjun
On 03/19/2016 08:24 AM, Hanjun Guo wrote: > On 2016/3/18 22:10, Vlastimil Babka wrote: >>>> >>>> Oh, ok, will try to send proper patch, once I figure out what to write in >>>> the changelog :) >>> Thanks in advance! >>> >> OK, here it is. Hanjun can you please retest this, as I'm not sure if you had > > I tested this new patch with stress for more than one hour, and it works! That's good news, thanks! > Since Lucas has comments on it, I'm willing to test further versions if needed. > > One minor comments below, > >> the same code due to the followup one-liner patches in the thread. Lucas, see if >> it helps with your issue as well. Laura and Joonsoo, please also test and review >> and check changelog if my perception of the problem is accurate :) >> >> Thanks >> > [...] >> + if (max_order < MAX_ORDER) { >> + /* If we are here, it means order is >= pageblock_order. >> + * We want to prevent merge between freepages on isolate >> + * pageblock and normal pageblock. Without this, pageblock >> + * isolation could cause incorrect freepage or CMA accounting. >> + * >> + * We don't want to hit this code for the more frequent >> + * low-order merging. >> + */ >> + if (unlikely(has_isolate_pageblock(zone))) { > > In the first version of your patch, it's > > + if (IS_ENABLED(CONFIG_CMA) && > + unlikely(has_isolate_pageblock(zone))) { > > Why remove the IS_ENABLED(CONFIG_CMA) in the new version? Previously I thought the problem was CMA-specific, but after more detailed look I think it's not, as start_isolate_page_range() releases zone lock between pageblocks, so unexpected merging due to races can happen also between isolated and non-isolated non-CMA pageblocks. This function is called from memory hotplug code, and recently also alloc_contig_range() itself is outside CONFIG_CMA for allocating gigantic hugepages. Joonsoo's original commit 3c60509 was also not restricted to CMA, and same with his patch earlier in this thread. Hmm I guess another alternate solution would indeed be to modify start_isolate_page_range() and undo_isolate_page_range() to hold zone->lock across MAX_ORDER blocks (not whole requested range, as that could lead to hardlockups). But that still wouldn't help Lucas, IUUC. > Thanks > Hanjun > >
Am Freitag, den 18.03.2016, 21:58 +0100 schrieb Vlastimil Babka: > On 03/18/2016 03:42 PM, Lucas Stach wrote: > > Am Freitag, den 18.03.2016, 15:10 +0100 schrieb Vlastimil Babka: > >> On 03/17/2016 04:52 PM, Joonsoo Kim wrote: > >> > 2016-03-18 0:43 GMT+09:00 Vlastimil Babka <vbabka@suse.cz>: > >> > >> OK, here it is. Hanjun can you please retest this, as I'm not sure if you had > >> the same code due to the followup one-liner patches in the thread. Lucas, see if > >> it helps with your issue as well. Laura and Joonsoo, please also test and review > >> and check changelog if my perception of the problem is accurate :) > >> > > > > This doesn't help for my case, as it is still trying to merge pages in > > isolated ranges. It even tries extra hard at doing so. > > > > With concurrent isolation and frees going on this may lead to the start > > page of the range to be isolated merging into an higher order buddy page > > if it isn't already pageblock aligned, leading both test_pages_isolated > > and isolate_freepages to fail on an otherwise perfectly fine range. > > > > What I am arguing is that if a page is freed into an isolated range we > > should not try merge it with it's buddies at all, by setting max_order = > > order. If the range is isolated because want to isolate freepages from > > it, the work to do the merging is wasted, as isolate_freepages will > > split higher order pages into order-0 pages again. > > > > If we already finished isolating freepages and are in the process of > > undoing the isolation, we don't strictly need to do the merging in > > __free_one_page, but can defer it to unset_migratetype_isolate, allowing > > to simplify those code paths by disallowing any merging of isolated > > pages at all. > > Oh, I think understand now. Yeah, skipping merging for pages in isolated > pageblocks might be a rather elegant solution. But still, we would have to check > buddy's migratetype at order >= pageblock_order like my patch does, which is > annoying. Because even without isolated merging, the buddy might have already > had order>=pageblock_order when it was isolated. > So what if isolation also split existing buddies in the pageblock immediately > when it sets the MIGRATETYPE_ISOLATE on the pageblock? Then we would have it > guaranteed that there's no isolated buddy - a buddy candidate at order >= > pageblock_order either has a smaller order (so it's not a buddy) or is not > MIGRATE_ISOLATE so it's safe to merge with. > > Does that make sense? > This might increase the the overhead of isolation a lot. CMA is also used for small order allocations, so the work of splitting a whole pageblock to allocate a small number of pages out just to merge a lot of them again on unisolation might make this unattractive. My feeling is that checking the buddy migratetype for >=pageblock_order frees might be lower overhead, but I have no hard numbers to back this claim. Then on the other hand moving the work to isolation/unisolation affects only code paths that are expected to be quite slow anyways, doing the check in _free_one_page will affect everyone. Regards, Lucas
On Fri, Mar 18, 2016 at 03:10:09PM +0100, Vlastimil Babka wrote: > On 03/17/2016 04:52 PM, Joonsoo Kim wrote: > > 2016-03-18 0:43 GMT+09:00 Vlastimil Babka <vbabka@suse.cz>: > >>>>>> > >>>>>> Okay. I used following slightly optimized version and I need to > >>>>>> add 'max_order = min_t(unsigned int, MAX_ORDER, pageblock_order + 1)' > >>>>>> to yours. Please consider it, too. > >>>>> > >>>>> Hmm, this one is not work, I still can see the bug is there after > >>>>> applying > >>>>> this patch, did I miss something? > >>>> > >>>> I may find that there is a bug which was introduced by me some time > >>>> ago. Could you test following change in __free_one_page() on top of > >>>> Vlastimil's patch? > >>>> > >>>> -page_idx = pfn & ((1 << max_order) - 1); > >>>> +page_idx = pfn & ((1 << MAX_ORDER) - 1); > >>> > >>> > >>> I tested Vlastimil's patch + your change with stress for more than half > >>> hour, the bug > >>> I reported is gone :) > >> > >> > >> Oh, ok, will try to send proper patch, once I figure out what to write in > >> the changelog :) > > > > Thanks in advance! > > > > OK, here it is. Hanjun can you please retest this, as I'm not sure if you had > the same code due to the followup one-liner patches in the thread. Lucas, see if > it helps with your issue as well. Laura and Joonsoo, please also test and review > and check changelog if my perception of the problem is accurate :) > > Thanks > > ----8<---- > From: Vlastimil Babka <vbabka@suse.cz> > Date: Fri, 18 Mar 2016 14:22:31 +0100 > Subject: [PATCH] mm/page_alloc: prevent merging between isolated and other > pageblocks > > Hanjun Guo has reported that a CMA stress test causes broken accounting of > CMA and free pages: > > > Before the test, I got: > > -bash-4.3# cat /proc/meminfo | grep Cma > > CmaTotal: 204800 kB > > CmaFree: 195044 kB > > > > > > After running the test: > > -bash-4.3# cat /proc/meminfo | grep Cma > > CmaTotal: 204800 kB > > CmaFree: 6602584 kB > > > > So the freed CMA memory is more than total.. > > > > Also the the MemFree is more than mem total: > > > > -bash-4.3# cat /proc/meminfo > > MemTotal: 16342016 kB > > MemFree: 22367268 kB > > MemAvailable: 22370528 kB > > Laura Abbott has confirmed the issue and suspected the freepage accounting > rewrite around 3.18/4.0 by Joonsoo Kim. Joonsoo had a theory that this is > caused by unexpected merging between MIGRATE_ISOLATE and MIGRATE_CMA > pageblocks: > > > CMA isolates MAX_ORDER aligned blocks, but, during the process, > > partialy isolated block exists. If MAX_ORDER is 11 and > > pageblock_order is 9, two pageblocks make up MAX_ORDER > > aligned block and I can think following scenario because pageblock > > (un)isolation would be done one by one. > > > > (each character means one pageblock. 'C', 'I' means MIGRATE_CMA, > > MIGRATE_ISOLATE, respectively. > > > > CC -> IC -> II (Isolation) > > II -> CI -> CC (Un-isolation) > > > > If some pages are freed at this intermediate state such as IC or CI, > > that page could be merged to the other page that is resident on > > different type of pageblock and it will cause wrong freepage count. > > This was supposed to be prevented by CMA operating on MAX_ORDER blocks, but > since it doesn't hold the zone->lock between pageblocks, a race window does > exist. > > It's also likely that unexpected merging can occur between MIGRATE_ISOLATE > and non-CMA pageblocks. This should be prevented in __free_one_page() since > commit 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated > pageblock"). However, we only check the migratetype of the pageblock where > buddy merging has been initiated, not the migratetype of the buddy pageblock > (or group of pageblocks) which can be MIGRATE_ISOLATE. > > Joonsoo has suggested checking for buddy migratetype as part of > page_is_buddy(), but that would add extra checks in allocator hotpath and > bloat-o-meter has shown significant code bloat (the function is inline). > > This patch reduces the bloat at some expense of more complicated code. The > buddy-merging while-loop in __free_one_page() is initially bounded to > pageblock_border and without any migratetype checks. The checks are placed > outside, bumping the max_order if merging is allowed, and returning to the > while-loop with a statement which can't be possibly considered harmful. > > This fixes the accounting bug and also removes the arguably weird state in the > original commit 3c605096d315 where buddies could be left unmerged. > > Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock") > Link: https://lkml.org/lkml/2016/3/2/280 > Reported-by: Hanjun Guo <guohanjun@huawei.com> > Debugged-by: Laura Abbott <labbott@redhat.com> > Debugged-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> > Signed-off-by: Vlastimil Babka <vbabka@suse.cz> > Cc: <stable@vger.kernel.org> # 3.18+ > --- > mm/page_alloc.c | 46 +++++++++++++++++++++++++++++++++------------- > 1 file changed, 33 insertions(+), 13 deletions(-) Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Thanks for taking care of this issue!. Thanks.
On 03/23/2016 05:44 AM, Joonsoo Kim wrote: >> >> Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock") >> Link: https://lkml.org/lkml/2016/3/2/280 >> Reported-by: Hanjun Guo <guohanjun@huawei.com> >> Debugged-by: Laura Abbott <labbott@redhat.com> >> Debugged-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> >> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> >> Cc: <stable@vger.kernel.org> # 3.18+ >> --- >> mm/page_alloc.c | 46 +++++++++++++++++++++++++++++++++------------- >> 1 file changed, 33 insertions(+), 13 deletions(-) > > Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> > > Thanks for taking care of this issue!. Thanks for the review. But I'm now not sure whether we push this to mainline+stable now, and later replace with Lucas' approach, or whether that approach would be also suitable and non-disruptive enough for stable?
2016-03-23 17:26 GMT+09:00 Vlastimil Babka <vbabka@suse.cz>: > On 03/23/2016 05:44 AM, Joonsoo Kim wrote: >>> >>> >>> Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on >>> isolated pageblock") >>> Link: https://lkml.org/lkml/2016/3/2/280 >>> Reported-by: Hanjun Guo <guohanjun@huawei.com> >>> Debugged-by: Laura Abbott <labbott@redhat.com> >>> Debugged-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> >>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> >>> Cc: <stable@vger.kernel.org> # 3.18+ >>> --- >>> mm/page_alloc.c | 46 +++++++++++++++++++++++++++++++++------------- >>> 1 file changed, 33 insertions(+), 13 deletions(-) >> >> >> Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> >> >> Thanks for taking care of this issue!. > > > Thanks for the review. But I'm now not sure whether we push this to > mainline+stable now, and later replace with Lucas' approach, or whether that > approach would be also suitable and non-disruptive enough for stable? Lucas' approach is for improvement and would be complex rather than this. I don't think it would be appropriate for stable. IMO, it's better to push this to mainline + stable now. Thanks.
diff --git a/mm/page_alloc.c b/mm/page_alloc.c index c46b75d14b6f..112a5d5cec51 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -683,34 +683,28 @@ static inline void __free_one_page(struct page *page, unsigned long combined_idx; unsigned long uninitialized_var(buddy_idx); struct page *buddy; - unsigned int max_order = MAX_ORDER; + unsigned int max_order; + + max_order = min_t(unsigned int, MAX_ORDER, pageblock_order + 1); VM_BUG_ON(!zone_is_initialized(zone)); VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP, page); VM_BUG_ON(migratetype == -1); - if (is_migrate_isolate(migratetype)) { - /* - * We restrict max order of merging to prevent merge - * between freepages on isolate pageblock and normal - * pageblock. Without this, pageblock isolation - * could cause incorrect freepage accounting. - */ - max_order = min_t(unsigned int, MAX_ORDER, pageblock_order + 1); - } else { + if (likely(!is_migrate_isolate(migratetype))) __mod_zone_freepage_state(zone, 1 << order, migratetype); - } - page_idx = pfn & ((1 << max_order) - 1); + page_idx = pfn & ((1 << MAX_ORDER) - 1); VM_BUG_ON_PAGE(page_idx & ((1 << order) - 1), page); VM_BUG_ON_PAGE(bad_range(zone, page), page); +continue_merging: while (order < max_order - 1) { buddy_idx = __find_buddy_index(page_idx, order); buddy = page + (buddy_idx - page_idx); if (!page_is_buddy(page, buddy, order)) - break; + goto done_merging; /* * Our buddy is free or it is CONFIG_DEBUG_PAGEALLOC guard page, * merge with it and move up one order. @@ -727,6 +721,32 @@ static inline void __free_one_page(struct page *page, page_idx = combined_idx; order++; } + if (max_order < MAX_ORDER) { + /* If we are here, it means order is >= pageblock_order. + * We want to prevent merge between freepages on isolate + * pageblock and normal pageblock. Without this, pageblock + * isolation could cause incorrect freepage or CMA accounting. + * + * We don't want to hit this code for the more frequent + * low-order merging. + */ + if (unlikely(has_isolate_pageblock(zone))) { + int buddy_mt; + + buddy_idx = __find_buddy_index(page_idx, order); + buddy = page + (buddy_idx - page_idx); + buddy_mt = get_pageblock_migratetype(buddy); + + if (migratetype != buddy_mt + && (is_migrate_isolate(migratetype) || + is_migrate_isolate(buddy_mt))) + goto done_merging; + } + max_order++; + goto continue_merging; + } + +done_merging: set_page_order(page, order); /*