Message ID | 20240824010441.21308-2-21cnbao@gmail.com
---|---
State | New
Series | mm: count the number of anonymous THPs per size
On 24.08.24 03:04, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
>
> Let's track for each anonymous THP size, how many of them are currently
> allocated. We'll track the complete lifespan of an anon THP, starting
> when it becomes an anon THP ("large anon folio") (->mapping gets set),
> until it gets freed (->mapping gets cleared).
>
> Introduce a new "nr_anon" counter per THP size and adjust the
> corresponding counter in the following cases:
> * We allocate a new THP and call folio_add_new_anon_rmap() to map
>   it the first time and turn it into an anon THP.
> * We split an anon THP into multiple smaller ones.
> * We migrate an anon THP, when we prepare the destination.
> * We free an anon THP back to the buddy.
>
> Note that AnonPages in /proc/meminfo currently tracks the total number
> of *mapped* anonymous *pages*, and therefore has slightly different
> semantics. In the future, we might also want to track "nr_anon_mapped"
> for each THP size, which might be helpful when comparing it to the
> number of allocated anon THPs (long-term pinning, stuck in swapcache,
> memory leaks, ...).
>
> Further note that for now, we only track anon THPs after they got their
> ->mapping set, for example via folio_add_new_anon_rmap(). If we would
> allocate some in the swapcache, they will only show up in the statistics
> for now after they have been mapped to user space the first time, where
> we call folio_add_new_anon_rmap().
>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> Acked-by: David Hildenbrand <david@redhat.com>
> ---
>  Documentation/admin-guide/mm/transhuge.rst |  5 +++++
>  include/linux/huge_mm.h                    | 15 +++++++++++++--
>  mm/huge_memory.c                           | 13 ++++++++++---
>  mm/migrate.c                               |  4 ++++
>  mm/page_alloc.c                            |  5 ++++-
>  mm/rmap.c                                  |  1 +
>  6 files changed, 37 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> index 79435c537e21..b78f2148b242 100644
> --- a/Documentation/admin-guide/mm/transhuge.rst
> +++ b/Documentation/admin-guide/mm/transhuge.rst
> @@ -551,6 +551,11 @@ split_deferred
>          it would free up some memory. Pages on split queue are going to
>          be split under memory pressure, if splitting is possible.

In light of documentation of patch #2, small nits:

> +nr_anon
> +       the number of transparent anon huge pages we have in the whole system.

s/transparent anon huge pages/anonymous THP/

> +       These huge pages could be entirely mapped or have partially

s/huge pages/THPs/

s/could be/might be currently/

Maybe Andrew can just fix it up if there are no other comments.

Thanks!
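For context on the "->mapping gets set ... until it gets cleared" lifespan described above: an anon folio is recognised by the PAGE_MAPPING_ANON bit stored in folio->mapping, which is also what the PageAnon() check in the free path of this patch keys off. A simplified paraphrase of the existing helper in include/linux/page-flags.h, included here only for context and not something the patch touches:

/* Simplified paraphrase of include/linux/page-flags.h, for context only. */
#define PAGE_MAPPING_ANON	0x1

static inline bool folio_test_anon(const struct folio *folio)
{
        /* anon folios store an anon_vma pointer in ->mapping with bit 0 set */
        return ((unsigned long)folio->mapping & PAGE_MAPPING_ANON) != 0;
}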
> Let's track for each anonymous THP size, how many of them are currently
> allocated. We'll track the complete lifespan of an anon THP, starting
> when it becomes an anon THP ("large anon folio") (->mapping gets set),
> until it gets freed (->mapping gets cleared).

IIUC, if an anon THP is swapped out as a whole, it is still being
counted, correct?

> Note that AnonPages in /proc/meminfo currently tracks the total number
> of *mapped* anonymous *pages*, and therefore has slightly different
> semantics. In the future, we might also want to track "nr_anon_mapped"
> for each THP size, which might be helpful when comparing it to the
> number of allocated anon THPs (long-term pinning, stuck in swapcache,
> memory leaks, ...).

If we do not consider tracking each THP size, can we expand the
NR_ANON_THPS statistic to include pte-mapped THP as well?

---
 mm/memcontrol-v1.c | 2 +-
 mm/rmap.c          | 5 +++++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol-v1.c b/mm/memcontrol-v1.c
index 44803cbea38a..3e44175db81f 100644
--- a/mm/memcontrol-v1.c
+++ b/mm/memcontrol-v1.c
@@ -786,7 +786,7 @@ static int mem_cgroup_move_account(struct folio *folio,
 	if (folio_mapped(folio)) {
 		__mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages);
 		__mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages);
-		if (folio_test_pmd_mappable(folio)) {
+		if (folio_test_large(folio)) {
 			__mod_lruvec_state(from_vec, NR_ANON_THPS,
 					   -nr_pages);
 			__mod_lruvec_state(to_vec, NR_ANON_THPS,
diff --git a/mm/rmap.c b/mm/rmap.c
index a8797d1b3d49..97eb25d023ba 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1291,6 +1291,11 @@ static void __folio_mod_stat(struct folio *folio, int nr, int nr_pmdmapped)
 	if (nr) {
 		idx = folio_test_anon(folio) ? NR_ANON_MAPPED : NR_FILE_MAPPED;
 		__lruvec_stat_mod_folio(folio, idx, nr);
+
+		if (folio_test_anon(folio) &&
+		    folio_test_large(folio) &&
+		    nr == 1 << folio_order(folio))
+			__lruvec_stat_mod_folio(folio, NR_ANON_THPS, nr);
 	}
 	if (nr_pmdmapped) {
 		if (folio_test_anon(folio)) {
--
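As a side note on the comparison with page-based counters raised above: each nr_anon increment represents one folio of 1 << order base pages, so putting it next to AnonPages or NR_ANON_THPS is just a left shift by the order. A hypothetical sketch (the helper name is assumed; it would have to live next to the existing static sum_mthp_stat() in mm/huge_memory.c, and is not part of the patch or of this reply):

/*
 * Hypothetical sketch, not part of the patch: express the per-order
 * anon-THP folio count as a number of base pages, so it can be put
 * side by side with page-based counters such as AnonPages.
 */
static unsigned long nr_anon_pages_of_order(int order)
{
        /* one order-N folio spans 1 << N base pages */
        return sum_mthp_stat(order, MTHP_STAT_NR_ANON) << order;
}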
diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index 79435c537e21..b78f2148b242 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -551,6 +551,11 @@ split_deferred
         it would free up some memory. Pages on split queue are going to
         be split under memory pressure, if splitting is possible.
 
+nr_anon
+       the number of transparent anon huge pages we have in the whole system.
+       These huge pages could be entirely mapped or have partially
+       unmapped/unused subpages.
+
 As the system ages, allocating huge pages may be expensive as the
 system uses memory compaction to copy data around memory to free a
 huge page for use. There are some counters in ``/proc/vmstat`` to help
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 4c32058cacfe..2ee2971e4e10 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -126,6 +126,7 @@ enum mthp_stat_item {
 	MTHP_STAT_SPLIT,
 	MTHP_STAT_SPLIT_FAILED,
 	MTHP_STAT_SPLIT_DEFERRED,
+	MTHP_STAT_NR_ANON,
 	__MTHP_STAT_COUNT
 };
 
@@ -136,14 +137,24 @@ struct mthp_stat {
 
 DECLARE_PER_CPU(struct mthp_stat, mthp_stats);
 
-static inline void count_mthp_stat(int order, enum mthp_stat_item item)
+static inline void mod_mthp_stat(int order, enum mthp_stat_item item, int delta)
 {
 	if (order <= 0 || order > PMD_ORDER)
 		return;
 
-	this_cpu_inc(mthp_stats.stats[order][item]);
+	this_cpu_add(mthp_stats.stats[order][item], delta);
+}
+
+static inline void count_mthp_stat(int order, enum mthp_stat_item item)
+{
+	mod_mthp_stat(order, item, 1);
 }
+
 #else
+static inline void mod_mthp_stat(int order, enum mthp_stat_item item, int delta)
+{
+}
+
 static inline void count_mthp_stat(int order, enum mthp_stat_item item)
 {
 }
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 513e7c87efee..26ad75fcda62 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -597,6 +597,7 @@ DEFINE_MTHP_STAT_ATTR(shmem_fallback_charge, MTHP_STAT_SHMEM_FALLBACK_CHARGE);
 DEFINE_MTHP_STAT_ATTR(split, MTHP_STAT_SPLIT);
 DEFINE_MTHP_STAT_ATTR(split_failed, MTHP_STAT_SPLIT_FAILED);
 DEFINE_MTHP_STAT_ATTR(split_deferred, MTHP_STAT_SPLIT_DEFERRED);
+DEFINE_MTHP_STAT_ATTR(nr_anon, MTHP_STAT_NR_ANON);
 
 static struct attribute *anon_stats_attrs[] = {
 	&anon_fault_alloc_attr.attr,
@@ -609,6 +610,7 @@ static struct attribute *anon_stats_attrs[] = {
 	&split_attr.attr,
 	&split_failed_attr.attr,
 	&split_deferred_attr.attr,
+	&nr_anon_attr.attr,
 	NULL,
 };
 
@@ -3314,8 +3316,9 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 	struct deferred_split *ds_queue = get_deferred_split_queue(folio);
 	/* reset xarray order to new order after split */
 	XA_STATE_ORDER(xas, &folio->mapping->i_pages, folio->index, new_order);
-	struct anon_vma *anon_vma = NULL;
+	bool is_anon = folio_test_anon(folio);
 	struct address_space *mapping = NULL;
+	struct anon_vma *anon_vma = NULL;
 	int order = folio_order(folio);
 	int extra_pins, ret;
 	pgoff_t end;
@@ -3327,7 +3330,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 	if (new_order >= folio_order(folio))
 		return -EINVAL;
 
-	if (folio_test_anon(folio)) {
+	if (is_anon) {
 		/* order-1 is not supported for anonymous THP. */
 		if (new_order == 1) {
 			VM_WARN_ONCE(1, "Cannot split to order-1 folio");
@@ -3367,7 +3370,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 	if (folio_test_writeback(folio))
 		return -EBUSY;
 
-	if (folio_test_anon(folio)) {
+	if (is_anon) {
 		/*
 		 * The caller does not necessarily hold an mmap_lock that would
 		 * prevent the anon_vma disappearing so we first we take a
@@ -3480,6 +3483,10 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 			}
 		}
 
+		if (is_anon) {
+			mod_mthp_stat(order, MTHP_STAT_NR_ANON, -1);
+			mod_mthp_stat(new_order, MTHP_STAT_NR_ANON, 1 << (order - new_order));
+		}
 		__split_huge_page(page, list, end, new_order);
 		ret = 0;
 	} else {
diff --git a/mm/migrate.c b/mm/migrate.c
index 4f55f4930fe8..3cc8555de6d6 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -450,6 +450,8 @@ static int __folio_migrate_mapping(struct address_space *mapping,
 		/* No turning back from here */
 		newfolio->index = folio->index;
 		newfolio->mapping = folio->mapping;
+		if (folio_test_anon(folio) && folio_test_large(folio))
+			mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON, 1);
 		if (folio_test_swapbacked(folio))
 			__folio_set_swapbacked(newfolio);
 
@@ -474,6 +476,8 @@ static int __folio_migrate_mapping(struct address_space *mapping,
 	 */
 	newfolio->index = folio->index;
 	newfolio->mapping = folio->mapping;
+	if (folio_test_anon(folio) && folio_test_large(folio))
+		mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON, 1);
 	folio_ref_add(newfolio, nr); /* add cache reference */
 	if (folio_test_swapbacked(folio)) {
 		__folio_set_swapbacked(newfolio);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8a67d760b71a..7dcb0713eb57 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1084,8 +1084,11 @@ __always_inline bool free_pages_prepare(struct page *page,
 			(page + i)->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
 		}
 	}
-	if (PageMappingFlags(page))
+	if (PageMappingFlags(page)) {
+		if (PageAnon(page))
+			mod_mthp_stat(order, MTHP_STAT_NR_ANON, -1);
 		page->mapping = NULL;
+	}
 	if (is_check_pages_enabled()) {
 		if (free_page_is_bad(page))
 			bad++;
diff --git a/mm/rmap.c b/mm/rmap.c
index 1103a536e474..78529cf0fd66 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1467,6 +1467,7 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
 	}
 
 	__folio_mod_stat(folio, nr, nr_pmdmapped);
+	mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON, 1);
 }
 
 static __always_inline void __folio_add_file_rmap(struct folio *folio,
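For context on how the new counter surfaces to userspace: nr_anon appears under /sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/stats/ next to the other per-size mTHP stats, and the read side sums the per-CPU values for the given order. Roughly like the following sketch, which paraphrases the pre-existing sum_mthp_stat()/DEFINE_MTHP_STAT_ATTR machinery in mm/huge_memory.c rather than anything this patch adds:

/* Paraphrase of the existing per-CPU read path in mm/huge_memory.c. */
static unsigned long sum_mthp_stat(int order, enum mthp_stat_item item)
{
        unsigned long sum = 0;
        int cpu;

        /* MTHP_STAT_NR_ANON is maintained per CPU; sum it on read. */
        for_each_possible_cpu(cpu) {
                struct mthp_stat *this = &per_cpu(mthp_stats, cpu);

                sum += this->stats[order][item];
        }

        return sum;
}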