Message ID | 20221220182745.1903540-3-roman.gushchin@linux.dev (mailing list archive)
---|---
State | New
Series | mm: kmem: optimize obj_cgroup pointer retrieval
On Tue, Dec 20, 2022 at 10:27:45AM -0800, Roman Gushchin wrote:
> To charge a freshly allocated kernel object to a memory cgroup, the
> kernel needs to obtain an objcg pointer. Currently it does it
> indirectly by obtaining the memcg pointer first and then calling to
> __get_obj_cgroup_from_memcg().
>
> Usually tasks spend their entire life belonging to the same object
> cgroup. So it makes sense to save the objcg pointer on task_struct
> directly, so it can be obtained faster. It requires some work on fork,
> exit and cgroup migrate paths, but these paths are way colder.
>
> The old indirect way is still used for remote memcg charging.
>
> Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>

This looks good too. Few comments below:

[...]

> +
> +#ifdef CONFIG_MEMCG_KMEM
> +static void mem_cgroup_kmem_attach(struct cgroup_taskset *tset)
> +{
> +	struct task_struct *task;
> +	struct cgroup_subsys_state *css;
> +
> +	cgroup_taskset_for_each(task, css, tset) {
> +		struct mem_cgroup *memcg;
> +
> +		if (task->objcg)
> +			obj_cgroup_put(task->objcg);
> +
> +		rcu_read_lock();
> +		memcg = container_of(css, struct mem_cgroup, css);
> +		task->objcg = __get_obj_cgroup_from_memcg(memcg);
> +		rcu_read_unlock();
> +	}
> +}
> +#else
> +static void mem_cgroup_kmem_attach(struct cgroup_taskset *tset) {}
> +#endif /* CONFIG_MEMCG_KMEM */
> +
> +#if defined(CONFIG_MEMCG_KMEM) || defined(CONFIG_MEMCG_KMEM)

I think you want CONFIG_LRU_GEN in the above check.

>  static void mem_cgroup_attach(struct cgroup_taskset *tset)
>  {
> +	mem_cgroup_lru_gen_attach(tset);
> +	mem_cgroup_kmem_attach(tset);
>  }
> -#endif /* CONFIG_LRU_GEN */
> +#endif
>
>  static int seq_puts_memcg_tunable(struct seq_file *m, unsigned long value)
>  {
> @@ -6816,9 +6872,15 @@ struct cgroup_subsys memory_cgrp_subsys = {
>  	.css_reset = mem_cgroup_css_reset,
>  	.css_rstat_flush = mem_cgroup_css_rstat_flush,
>  	.can_attach = mem_cgroup_can_attach,
> +#if defined(CONFIG_MEMCG_KMEM) || defined(CONFIG_MEMCG_KMEM)

Same here.

>  	.attach = mem_cgroup_attach,
> +#endif
>  	.cancel_attach = mem_cgroup_cancel_attach,
>  	.post_attach = mem_cgroup_move_task,
> +#ifdef CONFIG_MEMCG_KMEM
> +	.fork = mem_cgroup_fork,
> +	.exit = mem_cgroup_exit,
> +#endif
>  	.dfl_cftypes = memory_files,
>  	.legacy_cftypes = mem_cgroup_legacy_files,
>  	.early_init = 0,
> --
> 2.39.0
>
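For reference, the corrected guard the review presumably has in mind would look like the sketch below (the duplicated CONFIG_MEMCG_KMEM test replaced by CONFIG_LRU_GEN; the function body is the one from the patch as posted):

#if defined(CONFIG_MEMCG_KMEM) || defined(CONFIG_LRU_GEN)
/* Compile the attach hook whenever either of its two callees is built. */
static void mem_cgroup_attach(struct cgroup_taskset *tset)
{
	mem_cgroup_lru_gen_attach(tset);
	mem_cgroup_kmem_attach(tset);
}
#endif

The same replacement would apply to the guard around the .attach initializer in memory_cgrp_subsys.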
On Tue, Dec 20, 2022 at 10:27:45AM -0800, Roman Gushchin <roman.gushchin@linux.dev> wrote:
> To charge a freshly allocated kernel object to a memory cgroup, the
> kernel needs to obtain an objcg pointer. Currently it does it
> indirectly by obtaining the memcg pointer first and then calling to
> __get_obj_cgroup_from_memcg().

Jinx [1].

You report an additional 7% improvement with this patch (focused on
allocations only). I didn't see impressive numbers (different benchmark
in [1]), so it looked like a micro-optimization without big benefit to me.

My 0.02€ to RFC,
Michal

[1] https://bugzilla.kernel.org/show_bug.cgi?id=216038#c5
On Thu, Dec 22, 2022 at 02:50:44PM +0100, Michal Koutný wrote:
> On Tue, Dec 20, 2022 at 10:27:45AM -0800, Roman Gushchin <roman.gushchin@linux.dev> wrote:
> > To charge a freshly allocated kernel object to a memory cgroup, the
> > kernel needs to obtain an objcg pointer. Currently it does it
> > indirectly by obtaining the memcg pointer first and then calling to
> > __get_obj_cgroup_from_memcg().
>
> Jinx [1].
>
> You report an additional 7% improvement with this patch (focused on
> allocations only). I didn't see impressive numbers (different benchmark
> in [1]), so it looked like a micro-optimization without big benefit to me.

Hi Michal!

Thank you for taking a look. Do you have any numbers to share?

In general, I agree that it's a micro-optimization, but:
1) some people periodically complain that accounted allocations are slow
   in comparison to non-accounted and slower than they were with page-based
   accounting,
2) I don't see any particular hot point or obviously non-optimal place on
   the allocation path,
so if we want to make it faster, we have to micro-optimize it here and
there; there is no other way. It's basically a question of how many cache
lines we touch.

Btw, I'm working on a patch 3 for this series, which in early tests brings
additional ~25% improvement in my benchmark, hopefully will post it soon as
a part of v1.

Thanks!
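To put the cache-line argument in concrete terms, here is a standalone sketch using hypothetical stub types (not the real kernel structures) that contrasts the indirect lookup with the per-task cached pointer this series adds:

#include <stddef.h>

/* Hypothetical stub types, only to show the shape of the two lookups. */
struct obj_cgroup { int refcnt; };
struct mem_cgroup { struct obj_cgroup *objcg; };
struct task { struct mem_cgroup *memcg; struct obj_cgroup *objcg; };

/* Old path: two dependent loads, typically touching two cache lines. */
static struct obj_cgroup *objcg_indirect(struct task *t)
{
	struct mem_cgroup *memcg = t->memcg;	/* load 1: task -> memcg */
	return memcg ? memcg->objcg : NULL;	/* load 2: memcg -> objcg */
}

/* New path: a single load from the task structure itself. */
static struct obj_cgroup *objcg_cached(struct task *t)
{
	return t->objcg;
}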
Hello.

On Thu, Dec 22, 2022 at 08:21:49AM -0800, Roman Gushchin <roman.gushchin@linux.dev> wrote:
> Do you have any numbers to share?

The numbers are in bko#216038, let me explain them here a bit.

I used the will-it-scale benchmark that repeatedly locks/unlocks a file
and runs in parallel. The final numbers were:

  sample                      metric        δ      δ_cg
  no accounting implemented   32307750      0 %
  accounting in cg            2.49577e+07   -23 %  0 %
  accounting in cg + cache    2.51642e+07   -22 %  +1 %

Hence my result was only a 1% improvement. (But it was a very simple try,
not delving into any of the CPU cache statistics.)

Question: were your measurements multi-threaded?

> 1) some people periodically complain that accounted allocations are slow
>    in comparison to non-accounted and slower than they were with page-based
>    accounting,

My result above would likely not satisfy the complainers I know about.
But if your additional changes perform better, the additional code
complexity may be justified in the end.

> Btw, I'm working on a patch 3 for this series, which in early tests brings
> additional ~25% improvement in my benchmark, hopefully will post it soon as
> a part of v1.

Please send it with more details about your benchmark to put the numbers
into context.

Michal
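For readers unfamiliar with the benchmark: a will-it-scale-style lock test is essentially a tight lock/unlock loop on a file, run in parallel across CPUs; each iteration presumably allocates and frees a kernel lock structure, which is where the accounted-allocation path gets exercised. A minimal standalone approximation of such a loop (illustration only, not the actual will-it-scale source; the file path is made up):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/* Hypothetical scratch file; the real benchmark manages its own files. */
	int fd = open("/tmp/lock1-demo", O_CREAT | O_RDWR, 0600);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	struct flock lk = { .l_type = F_WRLCK, .l_whence = SEEK_SET };

	for (long i = 0; i < 10000000; i++) {
		lk.l_type = F_WRLCK;
		if (fcntl(fd, F_SETLK, &lk) < 0)	/* take the lock */
			perror("lock");
		lk.l_type = F_UNLCK;
		if (fcntl(fd, F_SETLK, &lk) < 0)	/* drop the lock */
			perror("unlock");
	}

	close(fd);
	unlink("/tmp/lock1-demo");
	return 0;
}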
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 853d08f7562b..e17be609cbcb 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1435,6 +1435,10 @@ struct task_struct {
 	struct mem_cgroup *active_memcg;
 #endif
 
+#ifdef CONFIG_MEMCG_KMEM
+	struct obj_cgroup *objcg;
+#endif
+
 #ifdef CONFIG_BLK_CGROUP
 	struct request_queue *throttle_queue;
 #endif
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 82828c51d2ea..e0547b224f40 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3001,23 +3001,29 @@ static struct obj_cgroup *__get_obj_cgroup_from_memcg(struct mem_cgroup *memcg)
 __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void)
 {
 	struct mem_cgroup *memcg;
-	struct obj_cgroup *objcg;
+	struct obj_cgroup *objcg = NULL;
 
 	if (in_task()) {
 		memcg = current->active_memcg;
-
-		/* Memcg to charge can't be determined. */
-		if (likely(!memcg) && (!current->mm || (current->flags & PF_KTHREAD)))
-			return NULL;
+		if (unlikely(memcg))
+			goto from_memcg;
+
+		if (current->objcg) {
+			rcu_read_lock();
+			do {
+				objcg = READ_ONCE(current->objcg);
+			} while (objcg && !obj_cgroup_tryget(objcg));
+			rcu_read_unlock();
+		}
 	} else {
 		memcg = this_cpu_read(int_active_memcg);
-		if (likely(!memcg))
-			return NULL;
+		if (unlikely(memcg))
+			goto from_memcg;
 	}
+	return objcg;
+
+from_memcg:
 	rcu_read_lock();
-	if (!memcg)
-		memcg = mem_cgroup_from_task(current);
 	objcg = __get_obj_cgroup_from_memcg(memcg);
 	rcu_read_unlock();
 	return objcg;
@@ -6303,6 +6309,28 @@ static void mem_cgroup_move_task(void)
 		mem_cgroup_clear_mc();
 	}
 }
+
+#ifdef CONFIG_MEMCG_KMEM
+static void mem_cgroup_fork(struct task_struct *task)
+{
+	struct mem_cgroup *memcg;
+
+	rcu_read_lock();
+	memcg = mem_cgroup_from_task(task);
+	if (!memcg || mem_cgroup_is_root(memcg))
+		task->objcg = NULL;
+	else
+		task->objcg = __get_obj_cgroup_from_memcg(memcg);
+	rcu_read_unlock();
+}
+
+static void mem_cgroup_exit(struct task_struct *task)
+{
+	if (task->objcg)
+		obj_cgroup_put(task->objcg);
+}
+#endif
+
 #else /* !CONFIG_MMU */
 static int mem_cgroup_can_attach(struct cgroup_taskset *tset)
 {
@@ -6317,7 +6345,7 @@ static void mem_cgroup_move_task(void)
 #endif
 
 #ifdef CONFIG_LRU_GEN
-static void mem_cgroup_attach(struct cgroup_taskset *tset)
+static void mem_cgroup_lru_gen_attach(struct cgroup_taskset *tset)
 {
 	struct task_struct *task;
 	struct cgroup_subsys_state *css;
@@ -6335,10 +6363,38 @@ static void mem_cgroup_attach(struct cgroup_taskset *tset)
 	task_unlock(task);
 }
 #else
+static void mem_cgroup_lru_gen_attach(struct cgroup_taskset *tset) {}
+#endif /* CONFIG_LRU_GEN */
+
+#ifdef CONFIG_MEMCG_KMEM
+static void mem_cgroup_kmem_attach(struct cgroup_taskset *tset)
+{
+	struct task_struct *task;
+	struct cgroup_subsys_state *css;
+
+	cgroup_taskset_for_each(task, css, tset) {
+		struct mem_cgroup *memcg;
+
+		if (task->objcg)
+			obj_cgroup_put(task->objcg);
+
+		rcu_read_lock();
+		memcg = container_of(css, struct mem_cgroup, css);
+		task->objcg = __get_obj_cgroup_from_memcg(memcg);
+		rcu_read_unlock();
+	}
+}
+#else
+static void mem_cgroup_kmem_attach(struct cgroup_taskset *tset) {}
+#endif /* CONFIG_MEMCG_KMEM */
+
+#if defined(CONFIG_MEMCG_KMEM) || defined(CONFIG_MEMCG_KMEM)
 static void mem_cgroup_attach(struct cgroup_taskset *tset)
 {
+	mem_cgroup_lru_gen_attach(tset);
+	mem_cgroup_kmem_attach(tset);
 }
-#endif /* CONFIG_LRU_GEN */
+#endif
 
 static int seq_puts_memcg_tunable(struct seq_file *m, unsigned long value)
 {
@@ -6816,9 +6872,15 @@ struct cgroup_subsys memory_cgrp_subsys = {
 	.css_reset = mem_cgroup_css_reset,
 	.css_rstat_flush = mem_cgroup_css_rstat_flush,
 	.can_attach = mem_cgroup_can_attach,
+#if defined(CONFIG_MEMCG_KMEM) || defined(CONFIG_MEMCG_KMEM)
 	.attach = mem_cgroup_attach,
+#endif
 	.cancel_attach = mem_cgroup_cancel_attach,
 	.post_attach = mem_cgroup_move_task,
+#ifdef CONFIG_MEMCG_KMEM
+	.fork = mem_cgroup_fork,
+	.exit = mem_cgroup_exit,
+#endif
 	.dfl_cftypes = memory_files,
 	.legacy_cftypes = mem_cgroup_legacy_files,
 	.early_init = 0,
To charge a freshly allocated kernel object to a memory cgroup, the
kernel needs to obtain an objcg pointer. Currently it does it
indirectly by obtaining the memcg pointer first and then calling to
__get_obj_cgroup_from_memcg().

Usually tasks spend their entire life belonging to the same object
cgroup. So it makes sense to save the objcg pointer on task_struct
directly, so it can be obtained faster. It requires some work on fork,
exit and cgroup migrate paths, but these paths are way colder.

The old indirect way is still used for remote memcg charging.

Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
---
 include/linux/sched.h |  4 +++
 mm/memcontrol.c       | 84 +++++++++++++++++++++++++++++++++++++------
 2 files changed, 77 insertions(+), 11 deletions(-)