[1/1] memcg/hugetlb: Adding hugeTLB counters to memory controller

Message ID 20241017160438.3893293-2-joshua.hahnjy@gmail.com (mailing list archive)
State New
Series memcg/hugetlb: Adding hugeTLB counters to memory controller

Commit Message

Joshua Hahn Oct. 17, 2024, 4:04 p.m. UTC
HugeTLB is added as a metric in memcg_stat_item, and is updated in the
hugeTLB alloc and free paths after (un)charging has already been
committed. Changes are batched and flushed like the rest of the memcg
stats, so the additional overhead from the infrequent hugetlb allocs /
frees is minimal.

Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
---
 include/linux/memcontrol.h | 3 +++
 mm/hugetlb.c               | 5 +++++
 mm/memcontrol.c            | 6 ++++++
 3 files changed, 14 insertions(+)

Comments

Johannes Weiner Oct. 17, 2024, 5:22 p.m. UTC | #1
On Thu, Oct 17, 2024 at 09:04:38AM -0700, Joshua Hahn wrote:
> HugeTLB is added as a metric in memcg_stat_item, and is updated in the
> alloc and free methods for hugeTLB, after (un)charging has already been
> committed. Changes are batched and updated / flushed like the rest of
> the memcg stats, which makes additional overhead by the infrequent
> hugetlb allocs / frees minimal.
> 
> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> ---
>  include/linux/memcontrol.h | 3 +++
>  mm/hugetlb.c               | 5 +++++
>  mm/memcontrol.c            | 6 ++++++
>  3 files changed, 14 insertions(+)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 34d2da05f2f1..66e925ae499a 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -39,6 +39,9 @@ enum memcg_stat_item {
>  	MEMCG_KMEM,
>  	MEMCG_ZSWAP_B,
>  	MEMCG_ZSWAPPED,
> +#ifdef CONFIG_HUGETLB_PAGE
> +	MEMCG_HUGETLB,
> +#endif
>  	MEMCG_NR_STAT,
>  };

It would be better to add a native vmstat counter for this, as there
is no strong reason to make this memcg specific. This would also make
it NUMA-node-aware.

IOW, add a new item to enum node_stat_item (plus the string in
vmstat.c and memcontrol.c).
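
As a rough sketch of that suggestion (not part of the posted patch; the
NR_HUGETLB name and the strings are illustrative placeholders), the counter
would sit alongside the other per-node stats:

	/* include/linux/mmzone.h (sketch) */
	enum node_stat_item {
		...
	#ifdef CONFIG_HUGETLB_PAGE
		NR_HUGETLB,		/* hypothetical item name */
	#endif
		NR_VM_NODE_STAT_ITEMS
	};

	/* mm/vmstat.c: vmstat_text[] (sketch) */
	#ifdef CONFIG_HUGETLB_PAGE
		"nr_hugetlb",
	#endif

	/* mm/memcontrol.c: memory_stats[] (sketch) */
	#ifdef CONFIG_HUGETLB_PAGE
		{ "hugetlb",			NR_HUGETLB			},
	#endif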

> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 190fa05635f4..ca7151096712 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1887,6 +1887,7 @@ void free_huge_folio(struct folio *folio)
>  	struct hstate *h = folio_hstate(folio);
>  	int nid = folio_nid(folio);
>  	struct hugepage_subpool *spool = hugetlb_folio_subpool(folio);
> +	struct mem_cgroup *memcg = get_mem_cgroup_from_current();
>  	bool restore_reserve;
>  	unsigned long flags;
>  
> @@ -1926,6 +1927,8 @@ void free_huge_folio(struct folio *folio)
>  	hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
>  					  pages_per_huge_page(h), folio);
>  	mem_cgroup_uncharge(folio);
> +	mod_memcg_state(memcg, MEMCG_HUGETLB, -pages_per_huge_page(h));
> +	mem_cgroup_put(memcg);
>  	if (restore_reserve)
>  		h->resv_huge_pages++;

This goes wrong if the folio is freed by somebody other than the
owning cgroup. For example if the task moved between cgroups after the
memory was charged.

It's better to use the folio->memcg linkage that was established by
the allocation path.

Use lruvec_stat_mod_folio(), it will handle all of this.

> @@ -3093,6 +3096,8 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
>  
>  	if (!memcg_charge_ret)
>  		mem_cgroup_commit_charge(folio, memcg);
> +
> +	mod_memcg_state(memcg, MEMCG_HUGETLB, nr_pages);

And here as well.
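
A minimal sketch of what both call sites might look like with
lruvec_stat_mod_folio() (assuming a node_stat_item entry such as the
hypothetical NR_HUGETLB above). The helper resolves the memcg from the
folio itself, so the free path stays correct even if the freeing task has
moved cgroups; in free_huge_folio() the decrement would need to happen
before mem_cgroup_uncharge() severs that linkage:

	/* mm/hugetlb.c: alloc_hugetlb_folio() (sketch) */
	if (!memcg_charge_ret)
		mem_cgroup_commit_charge(folio, memcg);
	lruvec_stat_mod_folio(folio, NR_HUGETLB, pages_per_huge_page(h));
	mem_cgroup_put(memcg);

	/* mm/hugetlb.c: free_huge_folio() (sketch) */
	lruvec_stat_mod_folio(folio, NR_HUGETLB, -pages_per_huge_page(h));
	mem_cgroup_uncharge(folio);
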
Shakeel Butt Oct. 18, 2024, 9:34 p.m. UTC | #2
On Thu, Oct 17, 2024 at 09:04:38AM GMT, Joshua Hahn wrote:
> HugeTLB is added as a metric in memcg_stat_item, and is updated in the
> alloc and free methods for hugeTLB, after (un)charging has already been
> committed. Changes are batched and updated / flushed like the rest of
> the memcg stats, which makes additional overhead by the infrequent
> hugetlb allocs / frees minimal.
> 
> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>

I have an orthogonal cleanup request (i.e. for after you are done with
this work). Hugetlb is the last user of the try-charge + commit protocol
for memcg charging. I think we should just remove that and use a simple
charge interface. You will need to reorder a couple of things, like
allocating the folio first and then charging it, and you will need to do
the right cleanup when the charge fails, but I think it will clean up the
error path of alloc_hugetlb_folio() a lot.
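
A very rough sketch of that reordering (the single-call charge function
named here, mem_cgroup_charge_hugetlb(), is hypothetical, as is the exact
error handling): allocate the folio first, charge it in one step, and
unwind the allocation if the charge fails:

	/* alloc_hugetlb_folio() shape after such a cleanup (sketch) */
	/* folio has already been allocated through the existing path */
	if (mem_cgroup_charge_hugetlb(folio, GFP_KERNEL)) {	/* hypothetical */
		free_huge_folio(folio);		/* unwind on charge failure */
		return ERR_PTR(-ENOMEM);
	}
	return folio;
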
Joshua Hahn Oct. 19, 2024, 10:45 p.m. UTC | #3
On Fri, Oct 18, 2024 at 5:34 PM Shakeel Butt <shakeel.butt@linux.dev> wrote:
>
> On Thu, Oct 17, 2024 at 09:04:38AM GMT, Joshua Hahn wrote:
> > HugeTLB is added as a metric in memcg_stat_item, and is updated in the
> > alloc and free methods for hugeTLB, after (un)charging has already been
> > committed. Changes are batched and updated / flushed like the rest of
> > the memcg stats, which makes additional overhead by the infrequent
> > hugetlb allocs / frees minimal.
> >
> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
>
> I have an orthogonal cleanup request (i.e. after you are done with this
> work). Hugetlb is the last user of try-charge + commit protocol for
> memcg charging. I think we should just remove that and use a simple
> charge interface. You will need to reorder couple of things like
> allocating the folio first and then charge and you will need to do right
> cleanup on charge failing but I think it will cleanup the error path of
> alloc_hugetlb_folio() a lot.

That sounds good to me. I was originally planning to include the
hugeTLB accounting in the try-charge mechanism (so as to only include
it in memory.stat if it is also accounted for in memory.current). I will
think of another way to do this accounting so that the cleanup becomes
easier once this patch is finished. One way I can think of is just to
check for the hugeTLB accounting config before adding the stats
and accounting for them.

Thank you for your feedback!

Joshua

Patch

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 34d2da05f2f1..66e925ae499a 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -39,6 +39,9 @@  enum memcg_stat_item {
 	MEMCG_KMEM,
 	MEMCG_ZSWAP_B,
 	MEMCG_ZSWAPPED,
+#ifdef CONFIG_HUGETLB_PAGE
+	MEMCG_HUGETLB,
+#endif
 	MEMCG_NR_STAT,
 };
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 190fa05635f4..ca7151096712 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1887,6 +1887,7 @@  void free_huge_folio(struct folio *folio)
 	struct hstate *h = folio_hstate(folio);
 	int nid = folio_nid(folio);
 	struct hugepage_subpool *spool = hugetlb_folio_subpool(folio);
+	struct mem_cgroup *memcg = get_mem_cgroup_from_current();
 	bool restore_reserve;
 	unsigned long flags;
 
@@ -1926,6 +1927,8 @@  void free_huge_folio(struct folio *folio)
 	hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
 					  pages_per_huge_page(h), folio);
 	mem_cgroup_uncharge(folio);
+	mod_memcg_state(memcg, MEMCG_HUGETLB, -pages_per_huge_page(h));
+	mem_cgroup_put(memcg);
 	if (restore_reserve)
 		h->resv_huge_pages++;
 
@@ -3093,6 +3096,8 @@  struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 
 	if (!memcg_charge_ret)
 		mem_cgroup_commit_charge(folio, memcg);
+
+	mod_memcg_state(memcg, MEMCG_HUGETLB, nr_pages);
 	mem_cgroup_put(memcg);
 
 	return folio;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7845c64a2c57..4180ee876adb 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -320,6 +320,9 @@  static const unsigned int memcg_stat_items[] = {
 	MEMCG_KMEM,
 	MEMCG_ZSWAP_B,
 	MEMCG_ZSWAPPED,
+#ifdef CONFIG_HUGETLB_PAGE
+	MEMCG_HUGETLB,
+#endif
 };
 
 #define NR_MEMCG_NODE_STAT_ITEMS ARRAY_SIZE(memcg_node_stat_items)
@@ -1324,6 +1327,9 @@  static const struct memory_stat memory_stats[] = {
 	{ "sock",			MEMCG_SOCK			},
 	{ "vmalloc",			MEMCG_VMALLOC			},
 	{ "shmem",			NR_SHMEM			},
+#ifdef CONFIG_HUGETLB_PAGE
+	{ "hugeTLB",			MEMCG_HUGETLB			},
+#endif
 #ifdef CONFIG_ZSWAP
 	{ "zswap",			MEMCG_ZSWAP_B			},
 	{ "zswapped",			MEMCG_ZSWAPPED			},