Message ID | 20180522132528.23769-2-guro@fb.com (mailing list archive) |
---|---|
State | New, archived |
On Tue 22-05-18 14:25:28, Roman Gushchin wrote:
> There are two cases when effective memory guarantee calculation
> is mistakenly skipped:
>
> 1) If memcg is a child of the root cgroup, and the root
> cgroup is not root_mem_cgroup (in other words, if the reclaim
> is targeted). Top-level memory cgroups are handled specially
> in mem_cgroup_protected(), because the root memory cgroup doesn't
> have memory guarantee and can't limit its children guarantees.
> So, all effective guarantee calculation is skipped.
> But in case of targeted reclaim things are different:
> cgroups, which parent exceeded its memory limit aren't special.
>
> 2) If memcg has no charged memory (memory usage is 0). In this
> case mem_cgroup_protected() always returns MEMCG_PROT_NONE, which
> is correct and prevents to generate fake memory low events for
> empty cgroups. But skipping memory emin/elow calculation is wrong:
> if there is no global memory pressure there might be no good
> chance again, so we can end up with effective guarantees set to 0
> without any reason.

Roman, so these two patches are on top of the min limit patches, right?
The fact that they come after just makes me feel this whole thing is not
completely thought through, and I would like to see all 4 patches in one
series describing the whole design. We are getting really close to the
merge window and last-minute updates make me really nervous. Can you
please repost the whole thing after the merge window?

As I've said earlier, I am not even sure we really want to have a hard
guarantee once we decided to go with the low limit. So very good reasoning
should be added for the whole thing.

Thanks!

On Mon, Jun 04, 2018 at 02:29:53PM +0200, Michal Hocko wrote:
> On Tue 22-05-18 14:25:28, Roman Gushchin wrote:
> > There are two cases when effective memory guarantee calculation
> > is mistakenly skipped:
> >
> > 1) If memcg is a child of the root cgroup, and the root
> > cgroup is not root_mem_cgroup (in other words, if the reclaim
> > is targeted). Top-level memory cgroups are handled specially
> > in mem_cgroup_protected(), because the root memory cgroup doesn't
> > have memory guarantee and can't limit its children guarantees.
> > So, all effective guarantee calculation is skipped.
> > But in case of targeted reclaim things are different:
> > cgroups, which parent exceeded its memory limit aren't special.
> >
> > 2) If memcg has no charged memory (memory usage is 0). In this
> > case mem_cgroup_protected() always returns MEMCG_PROT_NONE, which
> > is correct and prevents to generate fake memory low events for
> > empty cgroups. But skipping memory emin/elow calculation is wrong:
> > if there is no global memory pressure there might be no good
> > chance again, so we can end up with effective guarantees set to 0
> > without any reason.
>
> Roman, so these two patches are on top of the min limit patches, right?
> The fact that they come after just makes me feel this whole thing is not
> completely thought through and I would like to see all 4 patch in one
> series describing the whole design. We are getting really close to the
> merge window and last minute updates makes me really nervouse. Can you
> please repost the whole thing after the merge window, please?

Hi, Michal!

These changes fix some edge cases which I discovered when I started
writing unit tests for the memory controller (see
tools/testing/selftests/cgroup/). All these edge cases are temporary
effects which exist only when there is no global memory pressure.
We've been using my implementation in production for some time,
and so far have had no issues with it.

Please note that the existing implementation of memory.low has much more
serious problems: it barely works without some significant configuration
tweaks (e.g. setting all memory.low in the hierarchy to max, except
leaves), which are painful in production.

I'm happy to discuss any concrete issues/concerns, but I really see
no reason to drop it from the mm tree now and start the discussion
from scratch.

Thank you!

On Mon 04-06-18 17:23:06, Roman Gushchin wrote:
[...]
> I'm happy to discuss any concrete issues/concerns, but I really see
> no reasons to drop it from the mm tree now and start the discussion
> from scratch.

I do not think this is ready for the current merge window. Sorry! I
would really prefer to see the whole thing in one series to have a
better picture.

On Tue, Jun 05, 2018 at 11:03:49AM +0200, Michal Hocko wrote:
> On Mon 04-06-18 17:23:06, Roman Gushchin wrote:
> [...]
> > I'm happy to discuss any concrete issues/concerns, but I really see
> > no reasons to drop it from the mm tree now and start the discussion
> > from scratch.
>
> I do not think this is ready for the current merge window. Sorry! I
> would really prefer to see the whole thing in one series to have a
> better picture.

Please provide any specific reason for that. I appreciate your opinion,
but *I think* it's not an argument, seriously.

We discussed the patchset back in March and I made several iterations
based on the received feedback. Later we had a separate discussion with Greg,
who proposed an alternative solution, which, unfortunately, had some serious
shortcomings. And, as I remember, some time ago we discussed memory.min
with you.
And now you want to start from scratch without providing any reason.
I find it counter-productive, sorry.

Thanks!

On Tue 05-06-18 11:15:45, Roman Gushchin wrote:
> On Tue, Jun 05, 2018 at 11:03:49AM +0200, Michal Hocko wrote:
> > On Mon 04-06-18 17:23:06, Roman Gushchin wrote:
> > [...]
> > > I'm happy to discuss any concrete issues/concerns, but I really see
> > > no reasons to drop it from the mm tree now and start the discussion
> > > from scratch.
> >
> > I do not think this is ready for the current merge window. Sorry! I
> > would really prefer to see the whole thing in one series to have a
> > better picture.
>
> Please, provide any specific reason for that. I appreciate your opinion,
> but *I think* it's not an argument, seriously.

Seeing two follow-up fixes close to the merge window just speaks for
itself. Besides that, there is no need to rush this now.

> We've discussed the patchset back to March and I made several iterations
> based on the received feedback. Later we had a separate discussion with Greg,
> who proposed an alternative solution, which, unfortunately, had some serious
> shortcomings. And, as I remember, some time ago we've discussed memory.min
> with you.
> And now you want to start from scratch without providing any reason.
> I find it counter-productive, sorry.

I am sorry I couldn't give it more time, but this release cycle was even
crazier than usual.

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b9cd0bb63759..20c4f0a97d4c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5809,20 +5809,15 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
 	if (mem_cgroup_disabled())
 		return MEMCG_PROT_NONE;
 
-	if (!root)
-		root = root_mem_cgroup;
-	if (memcg == root)
+	if (memcg == root_mem_cgroup)
 		return MEMCG_PROT_NONE;
 
 	usage = page_counter_read(&memcg->memory);
-	if (!usage)
-		return MEMCG_PROT_NONE;
-
 	emin = memcg->memory.min;
 	elow = memcg->memory.low;
 
 	parent = parent_mem_cgroup(memcg);
-	if (parent == root)
+	if (parent == root_mem_cgroup)
 		goto exit;
 
 	parent_emin = READ_ONCE(parent->memory.emin);
@@ -5857,6 +5852,12 @@ enum mem_cgroup_protection mem_cgroup_protected(struct mem_cgroup *root,
 	memcg->memory.emin = emin;
 	memcg->memory.elow = elow;
 
+	if (root && memcg == root)
+		return MEMCG_PROT_NONE;
+
+	if (!usage)
+		return MEMCG_PROT_NONE;
+
 	if (usage <= emin)
 		return MEMCG_PROT_MIN;
 	else if (usage <= elow)

There are two cases when effective memory guarantee calculation
is mistakenly skipped:

1) If memcg is a child of the root cgroup, and the root
cgroup is not root_mem_cgroup (in other words, if the reclaim
is targeted). Top-level memory cgroups are handled specially
in mem_cgroup_protected(), because the root memory cgroup doesn't
have a memory guarantee and can't limit its children's guarantees.
So, all effective guarantee calculation is skipped.
But in the case of targeted reclaim things are different:
cgroups whose parent exceeded its memory limit aren't special.

2) If memcg has no charged memory (memory usage is 0). In this
case mem_cgroup_protected() always returns MEMCG_PROT_NONE, which
is correct and prevents generating fake memory low events for
empty cgroups. But skipping the memory emin/elow calculation is wrong:
if there is no global memory pressure, there might not be another
chance to calculate them, so we can end up with effective guarantees
set to 0 without any reason.

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>

---
 mm/memcontrol.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)
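
The two cases from the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: `struct cg`, `hier_root` (a stand-in for `root_mem_cgroup`), `protected_old()`, `protected_fixed()`, `classify()`, `min_ul()` and `show_cases()` are hypothetical names, and the part of `mem_cgroup_protected()` that the diff does not show is replaced by a plain "clamp by the parent's effective value" rule. Only the ordering of the early returns relative to the emin/elow update follows the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified user-space model; NOT the kernel's data structures. */
enum prot { PROT_NONE, PROT_LOW, PROT_MIN };

struct cg {
	struct cg *parent;		/* NULL only for the hierarchy root */
	unsigned long usage;		/* charged memory */
	unsigned long min, low;		/* configured guarantees */
	unsigned long emin, elow;	/* cached effective guarantees */
};

static struct cg hier_root;		/* hypothetical stand-in for root_mem_cgroup */

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Shared tail: compare usage against the effective guarantees. */
static enum prot classify(const struct cg *cg)
{
	if (cg->usage <= cg->emin)
		return PROT_MIN;
	if (cg->usage <= cg->elow)
		return PROT_LOW;
	return PROT_NONE;
}

/* Pre-patch ordering: the early returns come before emin/elow are stored. */
static enum prot protected_old(struct cg *root, struct cg *cg)
{
	assert(cg != NULL);
	if (!root)
		root = &hier_root;
	if (cg == root)
		return PROT_NONE;
	assert(cg->parent != NULL);	/* model handles non-root cgroups only */
	if (!cg->usage)
		return PROT_NONE;	/* case 2: emin/elow are never updated */

	cg->emin = cg->min;
	cg->elow = cg->low;
	if (cg->parent != root) {	/* case 1: skipped below a reclaim root */
		/* Assumption: clamp by the parent's effective values only. */
		cg->emin = min_ul(cg->emin, cg->parent->emin);
		cg->elow = min_ul(cg->elow, cg->parent->elow);
	}
	return classify(cg);
}

/* Post-patch ordering: emin/elow are always stored before returning. */
static enum prot protected_fixed(struct cg *root, struct cg *cg)
{
	assert(cg != NULL);
	if (cg == &hier_root)
		return PROT_NONE;
	assert(cg->parent != NULL);

	cg->emin = cg->min;
	cg->elow = cg->low;
	if (cg->parent != &hier_root) {
		cg->emin = min_ul(cg->emin, cg->parent->emin);
		cg->elow = min_ul(cg->elow, cg->parent->elow);
	}

	if (root && cg == root)
		return PROT_NONE;
	if (!cg->usage)
		return PROT_NONE;
	return classify(cg);
}

/* Prints how both orderings treat an empty cgroup and a targeted reclaim. */
static void show_cases(void)
{
	struct cg empty = { .parent = &hier_root, .min = 100 };
	struct cg a = { .parent = &hier_root, .usage = 200, .min = 50 };
	struct cg b = { .parent = &a, .usage = 80, .min = 100 };

	protected_old(NULL, &empty);
	printf("old:   empty.emin=%lu\n", empty.emin);
	protected_fixed(NULL, &empty);
	printf("fixed: empty.emin=%lu\n", empty.emin);

	protected_fixed(NULL, &a);	/* global pass caches a.emin */
	printf("old:   targeted reclaim, b -> %d\n", (int)protected_old(&a, &b));
	printf("fixed: targeted reclaim, b -> %d\n", (int)protected_fixed(&a, &b));
}
```

Calling `show_cases()` from a `main()` exercises both orderings. In this model the old ordering leaves the empty cgroup's cached `emin` untouched and lets `b` use its full configured `min` under a reclaim targeted at `a`, while the fixed ordering stores `emin` for the empty cgroup and clamps `b` by `a`'s effective value.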