Message ID | BL0PR02MB560170CD4D4245D4B89BC22EE9F40@BL0PR02MB5601.namprd02.prod.outlook.com (mailing list archive) |
---|---|
State | New, archived |
Series | [RFC] memcg: fix default behaviour of non-overridden memcg.swappiness |
On Thu 19-03-20 17:38:30, Ivan Teterevkov wrote:
> This patch tries to resolve uncertainty around the memcg.swappiness when
> it's not overridden by the user: shall there be the latest vm_swappiness
> or the value captured at the moment when the cgroup was created?
>
> I'm sitting on the fence with regards to this patch because cgroup v1 is
> considered legacy nowadays and the semantics of "swappiness" is already
> overwhelmed. However, the patch might be considered as a "fix" because
> looking at the documentation [1] one might have the impression that it's
> the latest /proc/sys/vm/swappiness value that should be found in the
> memcg.swappiness unless it's overridden or inherited from a cgroup where
> it was overridden when the given cgroup was created.

Could you be more specific what makes you think this? Let me quote the
whole thing here
: 5.3 swappiness
: --------------
:
: Overrides /proc/sys/vm/swappiness for the particular group. The tunable
: in the root cgroup corresponds to the global swappiness setting.
:
: Please note that unlike during the global reclaim, limit reclaim
: enforces that 0 swappiness really prevents from any swapping even if
: there is a swap storage available. This might lead to memcg OOM killer
: if there are no file pages to reclaim.

I do not want to pick on words here but to me it sounds this tunable is
clearly documented as the explicit override for the global value. The
root memcg corresponds to the global limit because root tends to be
special in many other aspects. But in general, the semantic of knobs is
that they do not unexpectedly change their values without an explicit
user/admin intervention.

> Also, shall this magic -1 be exposed to the user? I think it's a "no",
> but what if the user wants to un-override the memcg.swappiness...

If we are to use such a semantic then it absolutely has to be an opt-in
behavior and expressed in some way to the user space (e.g. a symbolic
name referring to the global setting).

> What do you reckon?

I am not convinced we need it. There would have to be a real life
usecase that cannot really work with the current semantic. I remember
that this has been brought up when discussing early swappiness
initialization [1]. But it seems there is a much better solution for
that problem [2].

[1] http://lkml.kernel.org/r/BL0PR02MB560167492CA4094C91589930E9FC0@BL0PR02MB5601.namprd02.prod.outlook.com
[2] http://lkml.kernel.org/r/20200317132105.24555-1-vbabka@suse.cz
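
For readers following the thread, the semantic defended in the reply above (a non-root cgroup keeps the value it captured at creation) can be sketched as a small user-space model. This is an illustration only, not kernel code: `struct cg`, `cg_create` and `cg_swappiness` are hypothetical names, and the logic only mirrors the pre-patch lines visible in the diff, where a child copies `mem_cgroup_swappiness(parent)` and the root always reports the global value.

```c
#include <stddef.h>

/* Hypothetical stand-in for the global /proc/sys/vm/swappiness value. */
int vm_swappiness = 60;

/* Hypothetical, heavily simplified stand-in for struct mem_cgroup. */
struct cg {
	const struct cg *parent;	/* NULL for the root cgroup */
	int swappiness;			/* ignored for the root cgroup */
};

/* Effective swappiness: the root cgroup always follows the global value. */
int cg_swappiness(const struct cg *cg)
{
	if (cg->parent == NULL)
		return vm_swappiness;
	return cg->swappiness;
}

/* Pre-patch creation: the child snapshots the parent's effective value. */
struct cg cg_create(const struct cg *parent)
{
	struct cg cg = { .parent = parent, .swappiness = 0 };

	if (parent != NULL)
		cg.swappiness = cg_swappiness(parent);
	return cg;
}
```

In this model, a child created while `vm_swappiness` is 60 still reports 60 after the global value is later changed to 10, while the root reports 10. That is the "knobs do not change without explicit intervention" behaviour.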
On Fri, 20 Mar 2020, Michal Hocko wrote:
> On Thu 19-03-20 17:38:30, Ivan Teterevkov wrote:
> > Also, shall this magic -1 be exposed to the user? I think it's a "no",
> > but what if the user wants to un-override the memcg.swappiness...
>
> If we are to use such a semantic then it absolutely has to be an opt-in behavior
> and expressed in some way to the user space (e.g. a symbolic name referring to
> the global setting).

A symbolic link would be a good approach but...

> I am not convinced we need it.

... agree and not going any further with the suggestion. Support of the sysctl
parameters in the kernel command line is a better solution and would address
my initially raised concern to tackle the configuration parameters.

Thanks
Ivan
diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index 0ae4f564c2d6..ccb4046c0aa3 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -610,8 +610,11 @@ Note:
 5.3 swappiness
 --------------

-Overrides /proc/sys/vm/swappiness for the particular group. The tunable
-in the root cgroup corresponds to the global swappiness setting.
+Overrides /proc/sys/vm/swappiness for the particular cgroup. The overridden
+memory.swappiness in the non-root cgroup is inherited by new child cgroups.
+The tunable in the root cgroup corresponds to the global swappiness setting;
+changes made there are also applied to the non-overridden memory.swappiness
+of the non-root cgroups.

 Please note that unlike during the global reclaim, limit reclaim
 enforces that 0 swappiness really prevents from any swapping even if
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index a7a0a1a5c8d5..b5d69648be88 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -240,7 +240,22 @@ struct mem_cgroup {
 	bool oom_lock;
 	int under_oom;

+	/*
+	 * Overrides the global vm_swappiness, unless there's a special case:
+	 *
+	 * - The swappiness in the root cgroup always corresponds to the global
+	 *   vm_swappiness and the value below is ignored.
+	 *
+	 * - The default value -1 means the cgroup uses the global
+	 *   vm_swappiness.
+	 *
+	 * - The value 0 prevents any swapping in the cgroup.
+	 *
+	 * Otherwise, any integer from 1 to 100 overrides the vm_swappiness
+	 * and is inherited by new child cgroups.
+	 */
 	int swappiness;
+
 	/* OOM-Killer disable */
 	int oom_kill_disable;

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 1e99f7ac1d7e..d4c65ebcae61 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -636,6 +636,10 @@ static inline int mem_cgroup_swappiness(struct mem_cgroup *memcg)
 	if (mem_cgroup_disabled() || mem_cgroup_is_root(memcg))
 		return vm_swappiness;

+	/* Not overridden? */
+	if (memcg->swappiness == -1)
+		return vm_swappiness;
+
 	return memcg->swappiness;
 }
 #else
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2058b8da18db..a95a7df46442 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4980,8 +4980,10 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 	memcg->high = PAGE_COUNTER_MAX;
 	memcg->soft_limit = PAGE_COUNTER_MAX;
 	if (parent) {
-		memcg->swappiness = mem_cgroup_swappiness(parent);
+		memcg->swappiness = parent->swappiness;
 		memcg->oom_kill_disable = parent->oom_kill_disable;
+	} else {
+		memcg->swappiness = -1;
 	}
 	if (parent && parent->use_hierarchy) {
 		memcg->use_hierarchy = true;
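
The lookup and inheritance rules introduced by the diff above can be modelled in user space roughly as follows. This is only a sketch under stated assumptions: `struct cg`, `cg_create`, `cg_swappiness` and the `SWAPPINESS_UNSET` macro are hypothetical names invented for the illustration (the patch itself uses a bare -1), and the model ignores everything in `struct mem_cgroup` other than the swappiness field.

```c
#include <stddef.h>

/* Hypothetical name for the patch's "magic -1" (not overridden). */
#define SWAPPINESS_UNSET (-1)

/* Hypothetical stand-in for the global /proc/sys/vm/swappiness value. */
int vm_swappiness = 60;

/* Hypothetical, heavily simplified stand-in for struct mem_cgroup. */
struct cg {
	const struct cg *parent;	/* NULL for the root cgroup */
	int swappiness;			/* SWAPPINESS_UNSET or an override */
};

/* Patched lookup: root and non-overridden cgroups follow the global value. */
int cg_swappiness(const struct cg *cg)
{
	if (cg->parent == NULL)
		return vm_swappiness;
	if (cg->swappiness == SWAPPINESS_UNSET)
		return vm_swappiness;
	return cg->swappiness;
}

/* Patched creation: copy the parent's raw field, so "unset" is inherited too. */
struct cg cg_create(const struct cg *parent)
{
	struct cg cg = { .parent = parent, .swappiness = SWAPPINESS_UNSET };

	if (parent != NULL)
		cg.swappiness = parent->swappiness;
	return cg;
}
```

Under this model, a cgroup that was never overridden tracks later changes to `vm_swappiness`, whereas a cgroup overridden to, say, 30 keeps reporting 30, and so does a child created beneath it after the override.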
This patch tries to resolve uncertainty around the memcg.swappiness when
it's not overridden by the user: shall there be the latest vm_swappiness
or the value captured at the moment when the cgroup was created?

I'm sitting on the fence with regards to this patch because cgroup v1 is
considered legacy nowadays and the semantics of "swappiness" is already
overwhelmed. However, the patch might be considered as a "fix" because
looking at the documentation [1] one might have the impression that it's
the latest /proc/sys/vm/swappiness value that should be found in the
memcg.swappiness unless it's overridden or inherited from a cgroup where
it was overridden when the given cgroup was created.

Also, shall this magic -1 be exposed to the user? I think it's a "no",
but what if the user wants to un-override the memcg.swappiness...

What do you reckon?

[1] https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v1/memory.html#swappiness

-------------------------------- %< --------------------------------

This patch makes the memcg with the non-overridden swappiness use the
latest value found in /proc/sys/vm/swappiness instead of one captured
at the time when the memcg was created.

Signed-off-by: Ivan Teterevkov <ivan.teterevkov@nutanix.com>
---
 Documentation/admin-guide/cgroup-v1/memory.rst |  7 +++++--
 include/linux/memcontrol.h                     | 15 +++++++++++++++
 include/linux/swap.h                           |  4 ++++
 mm/memcontrol.c                                |  4 +++-
 4 files changed, 27 insertions(+), 3 deletions(-)