| Message ID | 20220118182600.15007-1-mkoutny@suse.com (mailing list archive) |
|---|---|
| State | New |
| Series | mm/memcg: Do not check v1 event counter when not needed |
On 2022-01-18 19:26:00 [+0100], Michal Koutný wrote:
> I think it would make sense inserting the patch into your series and
> subsequently reject enabling on PREEMPT_RT -- provided this patch makes sense
> to others too -- the justification is rather functionality splitting for
> this PREEMPT_RT effort.

Interesting. So while looking at this today I came up with the patch at
the bottom. The other things I tried looked way uglier, and since nobody
will probably use it… Let me know how you want it to be integrated.

------>8------

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date: Tue, 18 Jan 2022 17:28:07 +0100
Subject: [PATCH] mm/memcg: Disable threshold event handlers on PREEMPT_RT
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

During the integration of PREEMPT_RT support, the code flow around
memcg_check_events() resulted in `twisted code'. Moving the code around
and avoiding that would in turn lead to an additional local-irq-save
section within memcg_check_events(). While that looks better, it adds a
local-irq-save section to a code flow which usually already runs within
a local-irq-save block.

The threshold event handler is a deprecated memcg v1 feature. Instead of
trying to get it to work under PREEMPT_RT, just disable it. There should
not have been any users on PREEMPT_RT, and from that perspective it
makes even less sense to get it to work under PREEMPT_RT while having
zero users.

Make memory.soft_limit_in_bytes and cgroup.event_control return
-EOPNOTSUPP on PREEMPT_RT. Make memcg_check_events() empty on PREEMPT_RT
since it won't do anything. Document that the two knobs are disabled on
PREEMPT_RT.
Suggested-by: Michal Hocko <mhocko@kernel.org>
Suggested-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 Documentation/admin-guide/cgroup-v1/memory.rst |  2 ++
 mm/memcontrol.c                                | 12 ++++++++++++
 2 files changed, 14 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index faac50149a222..2cc502a75ef64 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -64,6 +64,7 @@ Brief summary of control files.
                                      threads
 cgroup.procs                         show list of processes
 cgroup.event_control                 an interface for event_fd()
+                                     This knob is not available on CONFIG_PREEMPT_RT systems.
 memory.usage_in_bytes                show current usage for memory
                                      (See 5.5 for details)
 memory.memsw.usage_in_bytes          show current usage for memory+Swap
@@ -75,6 +76,7 @@ Brief summary of control files.
 memory.max_usage_in_bytes            show max memory usage recorded
 memory.memsw.max_usage_in_bytes      show max memory+Swap usage recorded
 memory.soft_limit_in_bytes           set/show soft limit of memory usage
+                                     This knob is not available on CONFIG_PREEMPT_RT systems.
 memory.stat                          show various statistics
 memory.use_hierarchy                 set/show hierarchical account enabled
                                      This knob is deprecated and shouldn't be
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2ed5f2a0879d3..3c4f7a0fd0039 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -821,6 +821,7 @@ static void mem_cgroup_charge_statistics(struct mem_cgroup *memcg,
 	__this_cpu_add(memcg->vmstats_percpu->nr_page_events, nr_pages);
 }

+#ifndef CONFIG_PREEMPT_RT
 static bool mem_cgroup_event_ratelimit(struct mem_cgroup *memcg,
 				       enum mem_cgroup_events_target target)
 {
@@ -864,6 +865,9 @@ static void memcg_check_events(struct mem_cgroup *memcg, int nid)
 			mem_cgroup_update_tree(memcg, nid);
 	}
 }
+#else
+static void memcg_check_events(struct mem_cgroup *memcg, int nid) { }
+#endif

 struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p)
 {
@@ -3751,8 +3755,12 @@ static ssize_t mem_cgroup_write(struct kernfs_open_file *of,
 		}
 		break;
 	case RES_SOFT_LIMIT:
+#ifndef CONFIG_PREEMPT_RT
 		memcg->soft_limit = nr_pages;
 		ret = 0;
+#else
+		ret = -EOPNOTSUPP;
+#endif
 		break;
 	}
 	return ret ?: nbytes;
@@ -4717,6 +4725,7 @@ static void memcg_event_ptable_queue_proc(struct file *file,
 static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
 					 char *buf, size_t nbytes, loff_t off)
 {
+#ifndef CONFIG_PREEMPT_RT
 	struct cgroup_subsys_state *css = of_css(of);
 	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
 	struct mem_cgroup_event *event;
@@ -4843,6 +4852,9 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,

 	kfree(event);
 	return ret;
+#else
+	return -EOPNOTSUPP;
+#endif
 }

 static struct cftype mem_cgroup_legacy_files[] = {
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 4a7b3ebf8e48..7f64ce33d137 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -106,6 +106,8 @@ static bool do_memsw_account(void)
 #define THRESHOLDS_EVENTS_TARGET 128
 #define SOFTLIMIT_EVENTS_TARGET 1024

+DEFINE_STATIC_KEY_FALSE(memcg_v1_events_enabled_key);
+
 /*
  * Cgroups above their limits are maintained in a RB-Tree, independent of
  * their hierarchy representation
@@ -852,6 +854,9 @@ static bool mem_cgroup_event_ratelimit(struct mem_cgroup *memcg,
  */
 static void memcg_check_events(struct mem_cgroup *memcg, int nid)
 {
+	if (!static_branch_unlikely(&memcg_v1_events_enabled_key))
+		return;
+
 	/* threshold event is triggered in finer grain than soft limit */
 	if (unlikely(mem_cgroup_event_ratelimit(memcg,
 						MEM_CGROUP_TARGET_THRESH))) {
@@ -3757,6 +3762,7 @@ static ssize_t mem_cgroup_write(struct kernfs_open_file *of,
 		break;
 	case RES_SOFT_LIMIT:
 		memcg->soft_limit = nr_pages;
+		static_branch_enable(&memcg_v1_events_enabled_key);
 		ret = 0;
 		break;
 	}
@@ -4831,6 +4837,8 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
 	list_add(&event->list, &memcg->event_list);
 	spin_unlock_irq(&memcg->event_list_lock);

+	static_branch_enable(&memcg_v1_events_enabled_key);
+
 	fdput(cfile);
 	fdput(efile);