memcg: no refill for offlined objcg

Message ID 20250410210535.1005312-1-shakeel.butt@linux.dev (mailing list archive)
State New
Series memcg: no refill for offlined objcg

Commit Message

Shakeel Butt April 10, 2025, 9:05 p.m. UTC
In our fleet, we are observing refill_obj_stock() spending a lot of CPU
time in obj_cgroup_get(), and on further inspection it seems the given
objcg is offlined, so the kernel has to take the slow path, i.e. atomic
operations, for objcg reference counting.

Besides the expensive atomic operations, refilling the stock with an
offlined objcg is a waste, as there will be no new allocations for the
offlined objcg. In addition, refilling triggers a flush of the previous
objcg, which might be used in the future. So, let's just avoid refilling
the stock with an offlined objcg.

Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
---
 mm/memcontrol.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)
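
For context, obj_cgroup_get() is a thin wrapper around percpu_ref_get(),
and the cost described above comes from the percpu_ref slow path. A
simplified sketch of that fast/slow path split (adapted from
include/linux/percpu-refcount.h, not part of the patch):

static inline void percpu_ref_get_sketch(struct percpu_ref *ref)
{
	unsigned long __percpu *percpu_count;

	rcu_read_lock();
	if (__ref_is_percpu(ref, &percpu_count))
		/* fast path: cheap per-CPU increment */
		this_cpu_inc(*percpu_count);
	else
		/* slow path after percpu_ref_kill(): atomic RMW on a shared counter */
		atomic_long_inc(&ref->data->count);
	rcu_read_unlock();
}

Once the memcg is offlined the objcg's refcnt is killed, so every
get/put takes the atomic branch.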

Comments

Roman Gushchin April 10, 2025, 11:59 p.m. UTC | #1
Shakeel Butt <shakeel.butt@linux.dev> writes:

> In our fleet, we are observing refill_obj_stock() spending a lot of CPU
> time in obj_cgroup_get(), and on further inspection it seems the given
> objcg is offlined, so the kernel has to take the slow path, i.e. atomic
> operations, for objcg reference counting.
>
> Besides the expensive atomic operations, refilling the stock with an
> offlined objcg is a waste, as there will be no new allocations for the
> offlined objcg. In addition, refilling triggers a flush of the previous
> objcg, which might be used in the future. So, let's just avoid refilling
> the stock with an offlined objcg.

Hm, but on the other hand, if there are multiple uncharges in a row,
refilling obj stocks might still be cheaper?

In general I think that switching to an atomic css refcnt on memcg
offlining is a mistake - it makes memory reclaim generally more
expensive. We can simply delay it until the approximate refcnt
reaches some low value, e.g. 100 objects.
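
Not part of this series, but a rough sketch of the delayed-switch idea
above; approx_objcg_refcnt() is a hypothetical helper (the kernel does
not export an approximate percpu-ref read), and the threshold is the
value suggested in the comment:

#define OBJCG_ATOMIC_SWITCH_THRESHOLD	100

static void objcg_offline_refcnt(struct obj_cgroup *objcg)
{
	/* approx_objcg_refcnt() is assumed: an approximate sum of the per-CPU counters */
	if (approx_objcg_refcnt(objcg) > OBJCG_ATOMIC_SWITCH_THRESHOLD)
		return;	/* stay in cheap percpu mode for now */

	/* few references left: the atomic slow path is now affordable */
	percpu_ref_switch_to_atomic(&objcg->refcnt, NULL);
}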
Shakeel Butt April 14, 2025, 6:44 p.m. UTC | #2
On Thu, Apr 10, 2025 at 11:59:47PM +0000, Roman Gushchin wrote:
> Shakeel Butt <shakeel.butt@linux.dev> writes:
> 
> > In our fleet, we are observing refill_obj_stock() spending a lot of CPU
> > time in obj_cgroup_get(), and on further inspection it seems the given
> > objcg is offlined, so the kernel has to take the slow path, i.e. atomic
> > operations, for objcg reference counting.
> >
> > Besides the expensive atomic operations, refilling the stock with an
> > offlined objcg is a waste, as there will be no new allocations for the
> > offlined objcg. In addition, refilling triggers a flush of the previous
> > objcg, which might be used in the future. So, let's just avoid refilling
> > the stock with an offlined objcg.
> 
> Hm, but on the other hand, if there are multiple uncharges in a row,
> refilling obj stocks might still be cheaper?
> 

Thanks for the review. I looked at the fleet data again, and what you
are suspecting, i.e. multiple objects of the same objcg getting freed
close together, is indeed possible. I think I should be optimizing /
batching at the upper layer. I will look at the RCU freeing side, which
is the most obvious batching opportunity.

> In general I think that switching to an atomic css refcnt on memcg
> offlining is a mistake - it makes memory reclaim generally more
> expensive. We can simply delay it until the approximate refcnt
> reaches some low value, e.g. 100 objects.

This is a good idea. For the memcg itself, I think that after Muchun's
LRU reparenting we should not have the zombie refcnt slowdown, but for
objcg this idea might help. Anyway, this is for later.
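
Again not from the patch, but a sketch of the batching idea mentioned
above: coalesce consecutive frees belonging to the same objcg so that
the uncharge (and hence the refcount traffic) is paid once per run of
objects rather than once per object. obj_cgroup_uncharge() is the real
kernel API; the parallel objcgs/sizes arrays are purely illustrative:

static void uncharge_freed_batch(struct obj_cgroup **objcgs,
				 unsigned int *sizes, int nr)
{
	struct obj_cgroup *cur = NULL;
	unsigned int bytes = 0;
	int i;

	for (i = 0; i < nr; i++) {
		if (objcgs[i] != cur) {
			/* objcg changed: flush the accumulated batch */
			if (cur)
				obj_cgroup_uncharge(cur, bytes);
			cur = objcgs[i];
			bytes = 0;
		}
		bytes += sizes[i];
	}
	if (cur)
		obj_cgroup_uncharge(cur, bytes);
}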

Patch

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2178a051bd09..23c62ae6a8c6 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2474,6 +2474,17 @@  static inline void __mod_objcg_mlstate(struct obj_cgroup *objcg,
 	rcu_read_unlock();
 }
 
+static inline void mod_objcg_mlstate(struct obj_cgroup *objcg,
+				     struct pglist_data *pgdat,
+				     enum node_stat_item idx, int nr)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	__mod_objcg_mlstate(objcg, pgdat, idx, nr);
+	local_irq_restore(flags);
+}
+
 static __always_inline
 struct mem_cgroup *mem_cgroup_from_obj_folio(struct folio *folio, void *p)
 {
@@ -2925,6 +2936,13 @@  static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
 	unsigned long flags;
 	unsigned int nr_pages = 0;
 
+	if (unlikely(percpu_ref_is_dying(&objcg->refcnt))) {
+		atomic_add(nr_bytes, &objcg->nr_charged_bytes);
+		if (pgdat)
+			mod_objcg_mlstate(objcg, pgdat, idx, nr_acct);
+		return;
+	}
+
 	local_lock_irqsave(&memcg_stock.stock_lock, flags);
 
 	stock = this_cpu_ptr(&memcg_stock);
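
A note on the check itself: percpu_ref_is_dying() is just a flag test,
so the new early return adds only a load and a branch to the common
(online) path. Simplified from include/linux/percpu-refcount.h:

static inline bool percpu_ref_is_dying(struct percpu_ref *ref)
{
	/* __PERCPU_REF_DEAD is set by percpu_ref_kill() at memcg offlining */
	return ref->percpu_count_ptr & __PERCPU_REF_DEAD;
}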