Message ID | 20200412140427.6732-1-laoar.shao@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | mm, memcg: fix inconsistent oom event behavior |
On Sun, Apr 12, 2020 at 7:04 AM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> A recent commit 9852ae3fe529 ("mm, memcg: consider subtrees in
> memory.events") changes the behavior of memcg events, which will
> consider subtrees in memory.events. But oom_kill event is a special one
> as it is used in both cgroup1 and cgroup2. In cgroup1, it is displayed
> in memory.oom_control. The file memory.oom_control is in both root memcg
> and non root memcg, that is different with memory.event as it only in
> non-root memcg. That commit is okay for cgroup2, but it is not okay for
> cgroup1 as it will cause inconsistent behavior between root memcg and
> non-root memcg.

I still couldn't understand the cgroup v1's root vs non_root behavior
change. The behavior change I see is the hierarchical one i.e.
MEMCG_OOM_KILL event in the descendant will cause the notification and
count increment in the ancestors even in the cgroup v1. I suppose we
don't want that behavior change in v1.

> Let's recover the original behavior for cgroup1.
Hi Yafang,

Yafang Shao writes:
>A recent commit 9852ae3fe529 ("mm, memcg: consider subtrees in
>memory.events") changes the behavior of memcg events, which will
>consider subtrees in memory.events. But oom_kill event is a special one
>as it is used in both cgroup1 and cgroup2. In cgroup1, it is displayed
>in memory.oom_control. The file memory.oom_control is in both root memcg
>and non root memcg, that is different with memory.event as it only in
>non-root memcg. That commit is okay for cgroup2, but it is not okay for
>cgroup1 as it will cause inconsistent behavior between root memcg and
>non-root memcg.
>Let's recover the original behavior for cgroup1.

Can you please explain the practical ramifications of this and show an
explicitly laid out example of how this manifests, with numbers and scenarios?
It's unclear to me that this is a real problem as is -- it may be, but there
certainly needs to be more information.

Thanks,

Chris
On Tue, Apr 14, 2020 at 1:06 AM Shakeel Butt <shakeelb@google.com> wrote:
>
> On Sun, Apr 12, 2020 at 7:04 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > A recent commit 9852ae3fe529 ("mm, memcg: consider subtrees in
> > memory.events") changes the behavior of memcg events, which will
> > consider subtrees in memory.events. But oom_kill event is a special one
> > as it is used in both cgroup1 and cgroup2. In cgroup1, it is displayed
> > in memory.oom_control. The file memory.oom_control is in both root memcg
> > and non root memcg, that is different with memory.event as it only in
> > non-root memcg. That commit is okay for cgroup2, but it is not okay for
> > cgroup1 as it will cause inconsistent behavior between root memcg and
> > non-root memcg.
>
> I still couldn't understand the cgroup v1's root vs non_root behavior
> change. The behavior change I see is the hierarchical one i.e.
> MEMCG_OOM_KILL event in the descendant will cause the notification and
> count increment in the ancestors even in the cgroup v1.

For the non-root memcg, its memory.oom_control(oom_kill) includes its
descendants' oom_kill, but for root memcg, it doesn't include its
descendants' oom_kill. That means, memory.oom_control(oom_kill) has
different meanings in different memcgs. That is inconsistent.

[snip]
> I suppose we
> don't want that behavior change in v1.
>

That is another topic. I think this feature is welcomed to cgroup1, if
we can fully support it, for example by adding memory.events.local
into cgroup1 as well, but as far as I know the cgroup1 is frozen.

> > Let's recover the original behavior for cgroup1.
Thanks
Yafang
On Tue, Apr 14, 2020 at 3:31 AM Chris Down <chris@chrisdown.name> wrote:
>
> Hi Yafang,
>
> Yafang Shao writes:
> >A recent commit 9852ae3fe529 ("mm, memcg: consider subtrees in
> >memory.events") changes the behavior of memcg events, which will
> >consider subtrees in memory.events. But oom_kill event is a special one
> >as it is used in both cgroup1 and cgroup2. In cgroup1, it is displayed
> >in memory.oom_control. The file memory.oom_control is in both root memcg
> >and non root memcg, that is different with memory.event as it only in
> >non-root memcg. That commit is okay for cgroup2, but it is not okay for
> >cgroup1 as it will cause inconsistent behavior between root memcg and
> >non-root memcg.
> >Let's recover the original behavior for cgroup1.
>
> Can you please explain the practical ramifications of this and show an
> explicitly laid out example of how this manifests, with numbers and scenarios?
> It's unclear to me that this is a real problem as is -- it may be, but there
> certainly needs to be more information.
>

Here's an example.

      root memcg
      /
   memcg foo
    /
 memcg bar

Suppose there's an oom_kill in memcg bar, then the oom_kill counts will be

      root memcg : memory.oom_control(oom_kill)  0
      /
   memcg foo : memory.oom_control(oom_kill)  1
    /
 memcg bar : memory.oom_control(oom_kill)  1

Then the user has to know whether the memcg is root or not. If it is
the root memcg, then memory.oom_control(oom_kill) is its local event only,
while if it is not the root memcg, then memory.oom_control(oom_kill)
includes all its descendants' oom_kill events.

Thanks
Yafang
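
The numbers in the example above can be reproduced with a small user-space
model of the loop in memcg_memory_event(). This is only a sketch, not kernel
code: struct memcg_model, model_oom_kill_event() and run_scenario() are
hypothetical names, and the stop_at_origin flag is an assumed stand-in for
the break condition (true when events are local only, false when they
propagate to ancestors).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified user-space stand-in for struct mem_cgroup (hypothetical). */
struct memcg_model {
	struct memcg_model *parent;	/* NULL for the root memcg */
	long oom_kill;			/* stand-in for the oom_kill event counter */
};

/*
 * Mirrors the shape of the do/while loop in memcg_memory_event():
 * count the event in the memcg where it happened, then walk up the
 * ancestors unless stop_at_origin is set. As in the kernel loop, the
 * walk never counts into the root memcg.
 */
static void model_oom_kill_event(struct memcg_model *memcg, bool stop_at_origin)
{
	assert(memcg != NULL);

	do {
		memcg->oom_kill++;
		if (stop_at_origin)
			break;
		memcg = memcg->parent;
	} while (memcg != NULL && memcg->parent != NULL);
}

/*
 * Builds root -> foo -> bar, fires one oom_kill in bar and reports the
 * counters as counts[0] = root, counts[1] = foo, counts[2] = bar.
 */
static void run_scenario(bool stop_at_origin, long counts[3])
{
	struct memcg_model root = { .parent = NULL, .oom_kill = 0 };
	struct memcg_model foo = { .parent = &root, .oom_kill = 0 };
	struct memcg_model bar = { .parent = &foo, .oom_kill = 0 };

	model_oom_kill_event(&bar, stop_at_origin);

	counts[0] = root.oom_kill;
	counts[1] = foo.oom_kill;
	counts[2] = bar.oom_kill;
}
```

Calling run_scenario(false, counts) from a test program models the
hierarchical walk and gives root 0, foo 1, bar 1, which is the mixed result
described above. Calling run_scenario(true, counts) models the local-only
behavior and gives root 0, foo 0, bar 1.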
On Mon, Apr 13, 2020 at 5:36 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> On Tue, Apr 14, 2020 at 1:06 AM Shakeel Butt <shakeelb@google.com> wrote:
> >
> > On Sun, Apr 12, 2020 at 7:04 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> > >
> > > A recent commit 9852ae3fe529 ("mm, memcg: consider subtrees in
> > > memory.events") changes the behavior of memcg events, which will
> > > consider subtrees in memory.events. But oom_kill event is a special one
> > > as it is used in both cgroup1 and cgroup2. In cgroup1, it is displayed
> > > in memory.oom_control. The file memory.oom_control is in both root memcg
> > > and non root memcg, that is different with memory.event as it only in
> > > non-root memcg. That commit is okay for cgroup2, but it is not okay for
> > > cgroup1 as it will cause inconsistent behavior between root memcg and
> > > non-root memcg.
> >
> > I still couldn't understand the cgroup v1's root vs non_root behavior
> > change. The behavior change I see is the hierarchical one i.e.
> > MEMCG_OOM_KILL event in the descendant will cause the notification and
> > count increment in the ancestors even in the cgroup v1.
>
> For the non-root memcg, its memory.oom_control(oom_kill) includes its
> descendants' oom_kill, but for root memcg, it doesn't include its
> descendants' oom_kill. That means, memory.oom_control(oom_kill) has
> different meanings in different memcgs. That is inconsistent.
>
> [snip]
> > I suppose we
> > don't want that behavior change in v1.
> >
>
> That is another topic. I think this feature is welcomed to cgroup1, if
> we can fully support it, for example by adding memory.events.local
> into cgroup1 as well, but as far as I know the cgroup1 is frozen.
>

Please note that after your patch the non-root memcg's
memory.oom_control(oom_kill) will not include the descendant's
oom_kill anymore. The non-root and root memcg's
memory.oom_control(oom_kill) will not be hierarchical anymore but
consistent.
I think that was the intention of the patch, right?
On Tue, Apr 14, 2020 at 8:53 AM Shakeel Butt <shakeelb@google.com> wrote:
>
> On Mon, Apr 13, 2020 at 5:36 PM Yafang Shao <laoar.shao@gmail.com> wrote:
> >
> > On Tue, Apr 14, 2020 at 1:06 AM Shakeel Butt <shakeelb@google.com> wrote:
> > >
> > > On Sun, Apr 12, 2020 at 7:04 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> > > >
> > > > A recent commit 9852ae3fe529 ("mm, memcg: consider subtrees in
> > > > memory.events") changes the behavior of memcg events, which will
> > > > consider subtrees in memory.events. But oom_kill event is a special one
> > > > as it is used in both cgroup1 and cgroup2. In cgroup1, it is displayed
> > > > in memory.oom_control. The file memory.oom_control is in both root memcg
> > > > and non root memcg, that is different with memory.event as it only in
> > > > non-root memcg. That commit is okay for cgroup2, but it is not okay for
> > > > cgroup1 as it will cause inconsistent behavior between root memcg and
> > > > non-root memcg.
> > >
> > > I still couldn't understand the cgroup v1's root vs non_root behavior
> > > change. The behavior change I see is the hierarchical one i.e.
> > > MEMCG_OOM_KILL event in the descendant will cause the notification and
> > > count increment in the ancestors even in the cgroup v1.
> >
> > For the non-root memcg, its memory.oom_control(oom_kill) includes its
> > descendants' oom_kill, but for root memcg, it doesn't include its
> > descendants' oom_kill. That means, memory.oom_control(oom_kill) has
> > different meanings in different memcgs. That is inconsistent.
> >
> > [snip]
> > > I suppose we
> > > don't want that behavior change in v1.
> > >
> >
> > That is another topic. I think this feature is welcomed to cgroup1, if
> > we can fully support it, for example by adding memory.events.local
> > into cgroup1 as well, but as far as I know the cgroup1 is frozen.
> >
>
> Please note that after your patch the non-root memcg's
> memory.oom_control(oom_kill) will not include the descendant's
> oom_kill anymore. The non-root and root memcg's
> memory.oom_control(oom_kill) will not be hierarchical anymore but
> consistent. I think that was the intention of the patch, right?
>

Right. If we can't fully support it in cgroup1, then let's don't touch
its original behavior.

> > > > Let's recover the original behavior for cgroup1.
> > > >
> > > > Fixes: 9852ae3fe529 ("mm, memcg: consider subtrees in memory.events")
> > > > Cc: Chris Down <chris@chrisdown.name>
> > > > Cc: Johannes Weiner <hannes@cmpxchg.org>
> > > > Cc: Shakeel Butt <shakeelb@google.com>
> > > > Cc: stable@vger.kernel.org
> > > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> > > > ---
> > > >  include/linux/memcontrol.h | 3 ++-
> > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > > > index 8c340e6b347f..a0ae080a67d1 100644
> > > > --- a/include/linux/memcontrol.h
> > > > +++ b/include/linux/memcontrol.h
> > > > @@ -798,7 +798,8 @@ static inline void memcg_memory_event(struct mem_cgroup *memcg,
> > > >  		atomic_long_inc(&memcg->memory_events[event]);
> > > >  		cgroup_file_notify(&memcg->events_file);
> > > >
> > > > -		if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS)
> > > > +		if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS ||
> > > > +		    !cgroup_subsys_on_dfl(memory_cgrp_subsys))
> > > >  			break;
> > > >  	} while ((memcg = parent_mem_cgroup(memcg)) &&
> > > >  		 !mem_cgroup_is_root(memcg));
> > > > --
> > > > 2.18.2
> > > >

Thanks
Yafang
On Mon, Apr 13, 2020 at 5:58 PM Yafang Shao <laoar.shao@gmail.com> wrote:
>
> On Tue, Apr 14, 2020 at 8:53 AM Shakeel Butt <shakeelb@google.com> wrote:
> >
> > On Mon, Apr 13, 2020 at 5:36 PM Yafang Shao <laoar.shao@gmail.com> wrote:
> > >
> > > On Tue, Apr 14, 2020 at 1:06 AM Shakeel Butt <shakeelb@google.com> wrote:
> > > >
> > > > On Sun, Apr 12, 2020 at 7:04 AM Yafang Shao <laoar.shao@gmail.com> wrote:
> > > > >
> > > > > A recent commit 9852ae3fe529 ("mm, memcg: consider subtrees in
> > > > > memory.events") changes the behavior of memcg events, which will
> > > > > consider subtrees in memory.events. But oom_kill event is a special one
> > > > > as it is used in both cgroup1 and cgroup2. In cgroup1, it is displayed
> > > > > in memory.oom_control. The file memory.oom_control is in both root memcg
> > > > > and non root memcg, that is different with memory.event as it only in
> > > > > non-root memcg. That commit is okay for cgroup2, but it is not okay for
> > > > > cgroup1 as it will cause inconsistent behavior between root memcg and
> > > > > non-root memcg.
> > > >
> > > > I still couldn't understand the cgroup v1's root vs non_root behavior
> > > > change. The behavior change I see is the hierarchical one i.e.
> > > > MEMCG_OOM_KILL event in the descendant will cause the notification and
> > > > count increment in the ancestors even in the cgroup v1.
> > >
> > > For the non-root memcg, its memory.oom_control(oom_kill) includes its
> > > descendants' oom_kill, but for root memcg, it doesn't include its
> > > descendants' oom_kill. That means, memory.oom_control(oom_kill) has
> > > different meanings in different memcgs. That is inconsistent.
> > >
> > > [snip]
> > > > I suppose we
> > > > don't want that behavior change in v1.
> > > >
> > >
> > > That is another topic. I think this feature is welcomed to cgroup1, if
> > > we can fully support it, for example by adding memory.events.local
> > > into cgroup1 as well, but as far as I know the cgroup1 is frozen.
> > >
> >
> > Please note that after your patch the non-root memcg's
> > memory.oom_control(oom_kill) will not include the descendant's
> > oom_kill anymore. The non-root and root memcg's
> > memory.oom_control(oom_kill) will not be hierarchical anymore but
> > consistent. I think that was the intention of the patch, right?
> >
>
> Right. If we can't fully support it in cgroup1, then let's don't touch
> its original behavior.
>

Agreed.

> > > > > Let's recover the original behavior for cgroup1.
> > > > >
> > > > > Fixes: 9852ae3fe529 ("mm, memcg: consider subtrees in memory.events")
> > > > > Cc: Chris Down <chris@chrisdown.name>
> > > > > Cc: Johannes Weiner <hannes@cmpxchg.org>
> > > > > Cc: Shakeel Butt <shakeelb@google.com>
> > > > > Cc: stable@vger.kernel.org
> > > > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>

Reviewed-by: Shakeel Butt <shakeelb@google.com>

> > > > > ---
> > > > >  include/linux/memcontrol.h | 3 ++-
> > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > > > > index 8c340e6b347f..a0ae080a67d1 100644
> > > > > --- a/include/linux/memcontrol.h
> > > > > +++ b/include/linux/memcontrol.h
> > > > > @@ -798,7 +798,8 @@ static inline void memcg_memory_event(struct mem_cgroup *memcg,
> > > > >  		atomic_long_inc(&memcg->memory_events[event]);
> > > > >  		cgroup_file_notify(&memcg->events_file);
> > > > >
> > > > > -		if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS)
> > > > > +		if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS ||
> > > > > +		    !cgroup_subsys_on_dfl(memory_cgrp_subsys))
> > > > >  			break;
> > > > >  	} while ((memcg = parent_mem_cgroup(memcg)) &&
> > > > >  		 !mem_cgroup_is_root(memcg));
> > > > > --
> > > > > 2.18.2
> > > > >
>
> Thanks
> Yafang
To be clear, you're correct that this wasn't intended to result in any changes on cgroup v1, so I'm not against the change. Especially for stable, though, I'd like to understand what the real results and ramifications are here.
On Wed, Apr 15, 2020 at 2:19 AM Chris Down <chris@chrisdown.name> wrote:
>
> To be clear, you're correct that this wasn't intended to result in any changes
> on cgroup v1, so I'm not against the change. Especially for stable, though, I'd
> like to understand what the real results and ramifications are here.

As explained above, the user tool parsing memory.oom_control is affected
by this behavioral change, and what's worse is there is no documentation
on it. I'm not against it if we think that is not enough to cc:stable.

Thanks
Yafang
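
To make the affected kind of tool concrete, here is a minimal sketch of how a
user-space program might pull the oom_kill counter out of the contents of a
cgroup v1 memory.oom_control file. The function name parse_oom_kill() is
hypothetical, and the "key value" line layout is an assumption based on the
usual output of that file. The parser only extracts the number. It has no way
to tell whether the number is local or includes descendants, which is the
ambiguity discussed in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Extracts the oom_kill counter from the text of a memory.oom_control
 * file. Returns 0 and stores the value in *out on success, or -1 if no
 * "oom_kill <number>" line is present.
 */
static int parse_oom_kill(const char *text, long *out)
{
	const char *line = text;

	assert(text != NULL && out != NULL);

	while (line != NULL && *line != '\0') {
		/*
		 * The literal "oom_kill" must be followed by a number, so a
		 * line such as "oom_kill_disable 0" does not match here.
		 */
		if (sscanf(line, "oom_kill %ld", out) == 1)
			return 0;

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line != NULL)
			line++;
	}
	return -1;
}
```

A caller would read the file into a buffer first and then pass the buffer to
parse_oom_kill(). Interpreting the result still requires knowing which memcg
the file belongs to and which kernel behavior is in effect.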
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 8c340e6b347f..a0ae080a67d1 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -798,7 +798,8 @@ static inline void memcg_memory_event(struct mem_cgroup *memcg,
 		atomic_long_inc(&memcg->memory_events[event]);
 		cgroup_file_notify(&memcg->events_file);
 
-		if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS)
+		if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS ||
+		    !cgroup_subsys_on_dfl(memory_cgrp_subsys))
 			break;
 	} while ((memcg = parent_mem_cgroup(memcg)) &&
 		 !mem_cgroup_is_root(memcg));
A recent commit 9852ae3fe529 ("mm, memcg: consider subtrees in
memory.events") changed the behavior of memcg events so that
memory.events now also counts events from subtrees. But the oom_kill
event is a special one, as it is used in both cgroup1 and cgroup2. In
cgroup1, it is displayed in memory.oom_control. The file
memory.oom_control exists in both the root memcg and non-root memcgs,
which is different from memory.events, as that file exists only in
non-root memcgs. That commit is okay for cgroup2, but it is not okay for
cgroup1, as it causes inconsistent behavior between the root memcg and
non-root memcgs.

Let's restore the original behavior for cgroup1.

Fixes: 9852ae3fe529 ("mm, memcg: consider subtrees in memory.events")
Cc: Chris Down <chris@chrisdown.name>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 include/linux/memcontrol.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
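
The effect of the changed condition in memcg_memory_event() can be written
out as a small truth table. This is an illustrative sketch only: the two
functions and their boolean parameters are hypothetical stand-ins, with
local_events modelling (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS)
and on_dfl modelling cgroup_subsys_on_dfl(memory_cgrp_subsys).

```c
#include <assert.h>
#include <stdbool.h>

/* Break condition before the patch: only the local-events flag matters. */
static bool stops_before_patch(bool local_events, bool on_dfl)
{
	(void)on_dfl;	/* not consulted by the old condition */
	return local_events;
}

/* Break condition after the patch: also stop when memcg is not on cgroup2. */
static bool stops_after_patch(bool local_events, bool on_dfl)
{
	return local_events || !on_dfl;
}

/*
 * Checks the two properties the patch is meant to have in this model:
 * nothing changes on the default (cgroup2) hierarchy, and on cgroup1
 * the walk always stops at the memcg where the event happened.
 */
static void check_patch_only_changes_v1(void)
{
	for (int le = 0; le <= 1; le++) {
		assert(stops_before_patch(le, true) == stops_after_patch(le, true));
		assert(stops_after_patch(le, false));
	}
}
```

In the real condition, the bitwise & binds more tightly than ||, so the
expression groups as (flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS) ||
!cgroup_subsys_on_dfl(...) without extra parentheses.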