Message ID | 20220308012047.26638-3-richard.weiyang@gmail.com (mailing list archive) |
---|---|
State | New |
Series | [1/3] mm/memcg: mz already removed from rb_tree in mem_cgroup_largest_soft_limit_node() |
On Tue 08-03-22 01:20:47, Wei Yang wrote:
> next_mz is removed from rb_tree, let's add it back if no reclaim has
> been tried.

Could you elaborate more why we need/want this?

> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> ---
>  mm/memcontrol.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 344a7e891bc5..e803ff02aae2 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3493,8 +3493,13 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
>  			loop > MEM_CGROUP_MAX_SOFT_LIMIT_RECLAIM_LOOPS))
>  			break;
>  	} while (!nr_reclaimed);
> -	if (next_mz)
> +	if (next_mz) {
> +		spin_lock_irq(&mctz->lock);
> +		excess = soft_limit_excess(next_mz->memcg);
> +		__mem_cgroup_insert_exceeded(next_mz, mctz, excess);
> +		spin_unlock_irq(&mctz->lock);
>  		css_put(&next_mz->memcg->css);
> +	}
>  	return nr_reclaimed;
>  }
>
> --
> 2.33.1
On Tue, Mar 08, 2022 at 09:17:58AM +0100, Michal Hocko wrote:
>On Tue 08-03-22 01:20:47, Wei Yang wrote:
>> next_mz is removed from rb_tree, let's add it back if no reclaim has
>> been tried.
>
>Could you elaborate more why we need/want this?
>

Per my understanding, we add back the right most node even reclaim makes no
progress, so it is reasonable to add back a node if we didn't get a chance to
do reclaim on it.

It looks like we forget to add it back to the tree. Maybe Johannes know some
background why we don't add it back?
[Cc Tim - the patch is http://lkml.kernel.org/r/20220308012047.26638-3-richard.weiyang@gmail.com]

On Wed 09-03-22 00:46:20, Wei Yang wrote:
> On Tue, Mar 08, 2022 at 09:17:58AM +0100, Michal Hocko wrote:
> >On Tue 08-03-22 01:20:47, Wei Yang wrote:
> >> next_mz is removed from rb_tree, let's add it back if no reclaim has
> >> been tried.
> >
> >Could you elaborate more why we need/want this?
> >
>
> Per my understanding, we add back the right most node even reclaim makes no
> progress, so it is reasonable to add back a node if we didn't get a chance to
> do reclaim on it.

Your patch sounded familiar and I can remember now. The same fix has
been posted by Tim last year
https://lore.kernel.org/linux-mm/8d35206601ccf0e1fe021d24405b2a0c2f4e052f.1613584277.git.tim.c.chen@linux.intel.com/
It was posted with other changes to the soft limit code which I didn't
like but I have acked this particular one. Not sure what has happened
with it afterwards.
On Wed, Mar 09, 2022 at 02:48:45PM +0100, Michal Hocko wrote:
>[Cc Tim - the patch is http://lkml.kernel.org/r/20220308012047.26638-3-richard.weiyang@gmail.com]
>
>On Wed 09-03-22 00:46:20, Wei Yang wrote:
>> On Tue, Mar 08, 2022 at 09:17:58AM +0100, Michal Hocko wrote:
>> >On Tue 08-03-22 01:20:47, Wei Yang wrote:
>> >> next_mz is removed from rb_tree, let's add it back if no reclaim has
>> >> been tried.
>> >
>> >Could you elaborate more why we need/want this?
>> >
>>
>> Per my understanding, we add back the right most node even reclaim makes no
>> progress, so it is reasonable to add back a node if we didn't get a chance to
>> do reclaim on it.
>
>Your patch sounded familiar and I can remember now. The same fix has
>been posted by Tim last year
>https://lore.kernel.org/linux-mm/8d35206601ccf0e1fe021d24405b2a0c2f4e052f.1613584277.git.tim.c.chen@linux.intel.com/
>It was posted with other changes to the soft limit code which I didn't
>like but I have acked this particular one. Not sure what has happened
>with it afterwards.

Because of this ?
4f09feb8bf: vm-scalability.throughput -4.3% regression
https://lore.kernel.org/linux-mm/20210302062521.GB23892@xsang-OptiPlex-9020/

>--
>Michal Hocko
>SUSE Labs
On Thu 10-03-22 01:13:50, Wei Yang wrote:
> On Wed, Mar 09, 2022 at 02:48:45PM +0100, Michal Hocko wrote:
> >[Cc Tim - the patch is http://lkml.kernel.org/r/20220308012047.26638-3-richard.weiyang@gmail.com]
> >
> >On Wed 09-03-22 00:46:20, Wei Yang wrote:
> >> On Tue, Mar 08, 2022 at 09:17:58AM +0100, Michal Hocko wrote:
> >> >On Tue 08-03-22 01:20:47, Wei Yang wrote:
> >> >> next_mz is removed from rb_tree, let's add it back if no reclaim has
> >> >> been tried.
> >> >
> >> >Could you elaborate more why we need/want this?
> >> >
> >>
> >> Per my understanding, we add back the right most node even reclaim makes no
> >> progress, so it is reasonable to add back a node if we didn't get a chance to
> >> do reclaim on it.
> >
> >Your patch sounded familiar and I can remember now. The same fix has
> >been posted by Tim last year
> >https://lore.kernel.org/linux-mm/8d35206601ccf0e1fe021d24405b2a0c2f4e052f.1613584277.git.tim.c.chen@linux.intel.com/
> >It was posted with other changes to the soft limit code which I didn't
> >like but I have acked this particular one. Not sure what has happened
> >with it afterwards.
>
> Because of this ?
> 4f09feb8bf: vm-scalability.throughput -4.3% regression
> https://lore.kernel.org/linux-mm/20210302062521.GB23892@xsang-OptiPlex-9020/

That was a regression for a different patch in the series AFAICS:
: FYI, we noticed a -4.3% regression of vm-scalability.throughput due to commit:
:
: commit: 4f09feb8bf083be3834080ddf3782aee12a7c3f7 ("mm: Force update of mem cgroup soft limit tree on usage excess")

That patch has played with how often memcg_check_events is called and
that can lead to a visible performance difference.
On Wed 09-03-22 14:48:46, Michal Hocko wrote:
> [Cc Tim - the patch is http://lkml.kernel.org/r/20220308012047.26638-3-richard.weiyang@gmail.com]
>
> On Wed 09-03-22 00:46:20, Wei Yang wrote:
> > On Tue, Mar 08, 2022 at 09:17:58AM +0100, Michal Hocko wrote:
> > >On Tue 08-03-22 01:20:47, Wei Yang wrote:
> > >> next_mz is removed from rb_tree, let's add it back if no reclaim has
> > >> been tried.
> > >
> > >Could you elaborate more why we need/want this?
> > >
> >
> > Per my understanding, we add back the right most node even reclaim makes no
> > progress, so it is reasonable to add back a node if we didn't get a chance to
> > do reclaim on it.
>
> Your patch sounded familiar and I can remember now. The same fix has
> been posted by Tim last year
> https://lore.kernel.org/linux-mm/8d35206601ccf0e1fe021d24405b2a0c2f4e052f.1613584277.git.tim.c.chen@linux.intel.com/

Btw. I forgot to mention yesterday. Whatever was the reason this has
slipped through cracks it would great if you could reuse the changelog
of the original patch which was more verbose and explicit about the
underlying problem. The only remaining part I would add is a description
of how serious the problem is. The removed memcg would be out of the
excess tree until further memory charges would get it back. But that can
take arbitrary amount of time. Whether that is a real problem would
depend on the workload of course but considering how coarse of a tool
the soft limit is it is possible that this is not something most users
would even notice.
On Thu, Mar 10, 2022 at 09:59:30AM +0100, Michal Hocko wrote:
>On Wed 09-03-22 14:48:46, Michal Hocko wrote:
>> [Cc Tim - the patch is http://lkml.kernel.org/r/20220308012047.26638-3-richard.weiyang@gmail.com]
>>
>> On Wed 09-03-22 00:46:20, Wei Yang wrote:
>> > On Tue, Mar 08, 2022 at 09:17:58AM +0100, Michal Hocko wrote:
>> > >On Tue 08-03-22 01:20:47, Wei Yang wrote:
>> > >> next_mz is removed from rb_tree, let's add it back if no reclaim has
>> > >> been tried.
>> > >
>> > >Could you elaborate more why we need/want this?
>> > >
>> >
>> > Per my understanding, we add back the right most node even reclaim makes no
>> > progress, so it is reasonable to add back a node if we didn't get a chance to
>> > do reclaim on it.
>>
>> Your patch sounded familiar and I can remember now. The same fix has
>> been posted by Tim last year
>> https://lore.kernel.org/linux-mm/8d35206601ccf0e1fe021d24405b2a0c2f4e052f.1613584277.git.tim.c.chen@linux.intel.com/
>
>Btw. I forgot to mention yesterday. Whatever was the reason this has
>slipped through cracks it would great if you could reuse the changelog
>of the original patch which was more verbose and explicit about the
>underlying problem. The only remaining part I would add is a description
>of how serious the problem is. The removed memcg would be out of the
>excess tree until further memory charges would get it back. But that can
>take arbitrary amount of time. Whether that is a real problem would
>depend on the workload of course but considering how coarse of a tool
>the soft limit is it is possible that this is not something most users
>would even notice.

Got it, would send a v2.

>--
>Michal Hocko
>SUSE Labs
On Thu, Mar 10, 2022 at 09:53:59AM +0100, Michal Hocko wrote:
>On Thu 10-03-22 01:13:50, Wei Yang wrote:
>> On Wed, Mar 09, 2022 at 02:48:45PM +0100, Michal Hocko wrote:
>> >[Cc Tim - the patch is http://lkml.kernel.org/r/20220308012047.26638-3-richard.weiyang@gmail.com]
>> >
>> >On Wed 09-03-22 00:46:20, Wei Yang wrote:
>> >> On Tue, Mar 08, 2022 at 09:17:58AM +0100, Michal Hocko wrote:
>> >> >On Tue 08-03-22 01:20:47, Wei Yang wrote:
>> >> >> next_mz is removed from rb_tree, let's add it back if no reclaim has
>> >> >> been tried.
>> >> >
>> >> >Could you elaborate more why we need/want this?
>> >> >
>> >>
>> >> Per my understanding, we add back the right most node even reclaim makes no
>> >> progress, so it is reasonable to add back a node if we didn't get a chance to
>> >> do reclaim on it.
>> >
>> >Your patch sounded familiar and I can remember now. The same fix has
>> >been posted by Tim last year
>> >https://lore.kernel.org/linux-mm/8d35206601ccf0e1fe021d24405b2a0c2f4e052f.1613584277.git.tim.c.chen@linux.intel.com/
>> >It was posted with other changes to the soft limit code which I didn't
>> >like but I have acked this particular one. Not sure what has happened
>> >with it afterwards.
>>
>> Because of this ?
>> 4f09feb8bf: vm-scalability.throughput -4.3% regression
>> https://lore.kernel.org/linux-mm/20210302062521.GB23892@xsang-OptiPlex-9020/
>
>That was a regression for a different patch in the series AFAICS:
>: FYI, we noticed a -4.3% regression of vm-scalability.throughput due to commit:
>:
>: commit: 4f09feb8bf083be3834080ddf3782aee12a7c3f7 ("mm: Force update of mem cgroup soft limit tree on usage excess")
>
>That patch has played with how often memcg_check_events is called and
>that can lead to a visible performance difference.

Yes, I mean maybe because of this regression, the whole patch set is removed.

>--
>Michal Hocko
>SUSE Labs
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 344a7e891bc5..e803ff02aae2 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3493,8 +3493,13 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
 			loop > MEM_CGROUP_MAX_SOFT_LIMIT_RECLAIM_LOOPS))
 			break;
 	} while (!nr_reclaimed);
-	if (next_mz)
+	if (next_mz) {
+		spin_lock_irq(&mctz->lock);
+		excess = soft_limit_excess(next_mz->memcg);
+		__mem_cgroup_insert_exceeded(next_mz, mctz, excess);
+		spin_unlock_irq(&mctz->lock);
 		css_put(&next_mz->memcg->css);
+	}
 	return nr_reclaimed;
 }
next_mz is removed from rb_tree, let's add it back if no reclaim has
been tried.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
 mm/memcontrol.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
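
The problem discussed in this thread can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: `struct toy_mz`, `toy_pick_largest()`, `toy_insert_exceeded()` and `toy_on_tree_after_bailout()` are hypothetical names, a flat array with an `on_tree` flag stands in for the rb_tree, and locking and reference counting are left out. It models one pass in which reclaim makes no progress, the current node is added back, `next_mz` has already been taken off the tree, and the loop then bails out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-node memcg entry; not the kernel struct. */
struct toy_mz {
	unsigned long usage;
	unsigned long soft_limit;
	bool on_tree;		/* models membership in the excess tree */
};

/* How far the entry is above its soft limit (0 if it is not above). */
static unsigned long toy_excess(const struct toy_mz *mz)
{
	return mz->usage > mz->soft_limit ? mz->usage - mz->soft_limit : 0;
}

/* Pick the on-tree entry with the largest excess and take it off the tree. */
static struct toy_mz *toy_pick_largest(struct toy_mz *v, size_t n)
{
	struct toy_mz *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!v[i].on_tree)
			continue;
		if (!best || toy_excess(&v[i]) > toy_excess(best))
			best = &v[i];
	}
	if (best)
		best->on_tree = false;
	return best;
}

/* Put an entry back on the tree, but only while it is over its soft limit. */
static void toy_insert_exceeded(struct toy_mz *mz)
{
	if (toy_excess(mz) > 0)
		mz->on_tree = true;
}

/*
 * One pass that bails out before next_mz is reclaimed from.
 * Returns how many entries are on the tree afterwards.
 */
static size_t toy_on_tree_after_bailout(bool reinsert_next)
{
	struct toy_mz v[] = {
		{ .usage = 300, .soft_limit = 100, .on_tree = true },
		{ .usage = 250, .soft_limit = 100, .on_tree = true },
		{ .usage = 120, .soft_limit = 100, .on_tree = true },
	};
	size_t n = sizeof(v) / sizeof(v[0]);
	size_t count = 0;

	struct toy_mz *mz = toy_pick_largest(v, n);
	assert(mz != NULL);

	/* Reclaim from mz makes no progress, so the next candidate is taken. */
	struct toy_mz *next_mz = toy_pick_largest(v, n);

	/* mz is added back even though nothing was reclaimed from it. */
	toy_insert_exceeded(mz);

	/* The loop limit is hit here; next_mz was never reclaimed from. */
	if (next_mz && reinsert_next)
		toy_insert_exceeded(next_mz);

	for (size_t i = 0; i < n; i++)
		if (v[i].on_tree)
			count++;
	return count;
}
```

In this model, calling `toy_on_tree_after_bailout(false)` leaves the second entry off the tree even though it is still over its soft limit, which corresponds to the situation described above where the removed memcg stays out of the excess tree until later charges bring it back. Calling it with `true` mirrors the intent of the patch.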