Message ID | 20210526075247.11130-1-dinghui@sangfor.com.cn (mailing list archive) |
---|---|
State | New, archived |
Series | [v3] mm/page_alloc: fix counting of free pages after take off from buddy |
On 26.05.21 09:52, Ding Hui wrote:
> Recently we found that a lot of MemFree is left in /proc/meminfo
> after soft offlining many pages, which is not quite correct.
>
> Before Oscar reworked soft offline for free pages [1], if we soft
> offlined free pages, these pages were left in buddy with the HWPoison
> flag, and NR_FREE_PAGES was not updated immediately. So the difference
> between NR_FREE_PAGES and the real number of available free pages was
> also large at the beginning.
>
> However, with the workload running, whenever we subsequently caught a
> HWPoison page in any alloc function, we removed it from buddy, updated
> NR_FREE_PAGES, and tried again, so NR_FREE_PAGES got closer and closer
> to the real number of available free pages
> (regardless of unpoison_memory()).
>
> Now, when offlining free pages, after a successful call to
> take_page_off_buddy(), the page no longer belongs to the buddy allocator
> and will not be used any more, but we miss accounting NR_FREE_PAGES in
> this situation, and there is no chance for it to be updated later.
>
> Do the update in take_page_off_buddy() like rmqueue() does, but avoid
> double counting if someone has already called set_migratetype_isolate()
> on the page.
>
> [1]: commit 06be6ff3d2ec ("mm,hwpoison: rework soft offline for free pages")
>
> Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
> ---
> v3:
> - as Naoya Horiguchi suggested, do the update only when
>   is_migrate_isolate(migratetype) is false
> - updated patch description
>
> v2:
> - https://lore.kernel.org/linux-mm/20210508035533.23222-1-dinghui@sangfor.com.cn/
> - use __mod_zone_freepage_state instead of __mod_zone_page_state
>
>  mm/page_alloc.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index aaa1655cf682..d1f5de1c1283 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -9158,6 +9158,8 @@ bool take_page_off_buddy(struct page *page)
>  			del_page_from_free_list(page_head, zone, page_order);
>  			break_down_buddy_pages(zone, page_head, page, 0,
>  						page_order, migratetype);
> +			if (!is_migrate_isolate(migratetype))
> +				__mod_zone_freepage_state(zone, -1, migratetype);
>  			ret = true;
>  			break;
>  		}
>

I guess if we'd actually be removing a page from the buddy while it's
currently isolated by someone else (i.e., alloc_contig_range()), we might be
in bigger trouble.

I think we should actually skip isolated pages completely.
take_page_off_buddy() should not touch them.

Anyhow, different problem, so

Acked-by: David Hildenbrand <david@redhat.com>
On Wed, May 26, 2021 at 03:52:47PM +0800, Ding Hui wrote:
> [...]
> Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>

Reviewed-by: Oscar Salvador <osalvador@suse.de>
On Wed, May 26, 2021 at 09:58:15AM +0200, David Hildenbrand wrote:
> I guess if we'd actually be removing a page from the buddy while it's
> currently isolated by someone else (i.e., alloc_contig_range()), we might be
> in bigger trouble.
>
> I think we should actually skip isolated pages completely.
> take_page_off_buddy() should not touch them.

That might be a problem indeed.
I will have a look at it.

Thanks
On Wed, May 26, 2021 at 03:52:47PM +0800, Ding Hui wrote:
> [...]
> Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>

Thank you very much.

Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

As for unpoison_memory(), I'm writing patches to fix unpoison (it may take
a few weeks to post them). They will add a reverse operation of
take_page_off_buddy() that simply calls __free_one_page(), so the
NR_FREE_PAGES counter will also be handled correctly with those patches.
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index aaa1655cf682..d1f5de1c1283 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -9158,6 +9158,8 @@ bool take_page_off_buddy(struct page *page)
 			del_page_from_free_list(page_head, zone, page_order);
 			break_down_buddy_pages(zone, page_head, page, 0,
 						page_order, migratetype);
+			if (!is_migrate_isolate(migratetype))
+				__mod_zone_freepage_state(zone, -1, migratetype);
 			ret = true;
 			break;
 		}
Recently we found that a lot of MemFree is left in /proc/meminfo
after soft offlining many pages, which is not quite correct.

Before Oscar reworked soft offline for free pages [1], if we soft
offlined free pages, these pages were left in buddy with the HWPoison
flag, and NR_FREE_PAGES was not updated immediately. So the difference
between NR_FREE_PAGES and the real number of available free pages was
also large at the beginning.

However, with the workload running, whenever we subsequently caught a
HWPoison page in any alloc function, we removed it from buddy, updated
NR_FREE_PAGES, and tried again, so NR_FREE_PAGES got closer and closer
to the real number of available free pages
(regardless of unpoison_memory()).

Now, when offlining free pages, after a successful call to
take_page_off_buddy(), the page no longer belongs to the buddy allocator
and will not be used any more, but we miss accounting NR_FREE_PAGES in
this situation, and there is no chance for it to be updated later.

Do the update in take_page_off_buddy() like rmqueue() does, but avoid
double counting if someone has already called set_migratetype_isolate()
on the page.

[1]: commit 06be6ff3d2ec ("mm,hwpoison: rework soft offline for free pages")

Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
---
v3:
- as Naoya Horiguchi suggested, do the update only when
  is_migrate_isolate(migratetype) is false
- updated patch description

v2:
- https://lore.kernel.org/linux-mm/20210508035533.23222-1-dinghui@sangfor.com.cn/
- use __mod_zone_freepage_state instead of __mod_zone_page_state

 mm/page_alloc.c | 2 ++
 1 file changed, 2 insertions(+)
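
The accounting argument in the description above can be illustrated with a small userspace model. This is only a sketch and not kernel code: every `toy_*` name is hypothetical, and the model assumes that isolating a free page already subtracts it from the free-page counter, which is the reason the description gives for skipping the update on isolated pages. It compares three policies: no update (the behaviour being fixed), an unconditional update, and the v3 behaviour of updating only when the migratetype is not isolate.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none of these names exist in the kernel. */
enum toy_migratetype { TOY_MIGRATE_MOVABLE, TOY_MIGRATE_ISOLATE };

enum toy_policy {
	TOY_NO_UPDATE,               /* counter never touched when taking a page off */
	TOY_ALWAYS_UPDATE,           /* counter always decremented */
	TOY_UPDATE_UNLESS_ISOLATED   /* v3: decrement only for non-isolated pages */
};

struct toy_zone {
	long nr_free_pages;      /* models the NR_FREE_PAGES counter */
	long allocatable_pages;  /* pages an allocation could really obtain */
};

static bool toy_is_migrate_isolate(enum toy_migratetype mt)
{
	return mt == TOY_MIGRATE_ISOLATE;
}

/*
 * Models isolating one free page. Assumption of this sketch: isolation
 * makes the page unallocatable and already subtracts it from the counter.
 */
static void toy_isolate_free_page(struct toy_zone *zone)
{
	assert(zone);
	zone->allocatable_pages -= 1;
	zone->nr_free_pages -= 1;
}

/* Models taking one free page off the allocator under a given policy. */
static void toy_take_page_off_buddy(struct toy_zone *zone,
				    enum toy_migratetype mt,
				    enum toy_policy policy)
{
	bool isolated = toy_is_migrate_isolate(mt);

	assert(zone);

	/* An isolated page was already unallocatable, so only count others. */
	if (!isolated)
		zone->allocatable_pages -= 1;

	if (policy == TOY_ALWAYS_UPDATE ||
	    (policy == TOY_UPDATE_UNLESS_ISOLATED && !isolated))
		zone->nr_free_pages -= 1;
}

/*
 * Offlines one ordinary free page and one previously isolated free page,
 * then returns counter minus truth (0 means the counter is accurate).
 */
static long toy_counter_drift(enum toy_policy policy)
{
	struct toy_zone zone = { .nr_free_pages = 100, .allocatable_pages = 100 };

	toy_take_page_off_buddy(&zone, TOY_MIGRATE_MOVABLE, policy);

	toy_isolate_free_page(&zone);
	toy_take_page_off_buddy(&zone, TOY_MIGRATE_ISOLATE, policy);

	return zone.nr_free_pages - zone.allocatable_pages;
}
```

In this model, `toy_counter_drift(TOY_NO_UPDATE)` returns 1 (the counter overstates free memory), `toy_counter_drift(TOY_ALWAYS_UPDATE)` returns -1 (the isolated page is counted twice), and `toy_counter_drift(TOY_UPDATE_UNLESS_ISOLATED)` returns 0.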