Message ID | 3b6e4e9aa8b3ee1466269baf23ed82d90a8f791c.1612902157.git.tim.c.chen@linux.intel.com |
---|---|
State | New, archived |
Series | Soft limit memory management bug fixes |
Hello Tim,

On Tue, Feb 09, 2021 at 12:29:47PM -0800, Tim Chen wrote:
> @@ -6849,7 +6850,9 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug)
>  	 * exclusive access to the page.
>  	 */
>
> -	if (ug->memcg != page_memcg(page)) {
> +	if (ug->memcg != page_memcg(page) ||
> +	    /* uncharge batch update soft limit tree on a node basis */
> +	    (ug->dummy_page && ug->nid != page_to_nid(page))) {

The fix makes sense to me.

However, unconditionally breaking up the batch by node can
unnecessarily regress workloads in cgroups that do not have a soft
limit configured, and cgroup2 which doesn't have soft limits at
all. Consider an interleaving allocation policy for example.

Can you please further gate on memcg->soft_limit != PAGE_COUNTER_MAX,
or at least on !cgroup_subsys_on_dfl(memory_cgrp_subsys)?

Thanks
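For illustration only, a minimal sketch of the gating Johannes asks for, written against the v5.11-era mm/memcontrol.c discussed in this thread. The helper name is hypothetical and this is not part of the posted patch or of any follow-up in the series; the fields and helpers it relies on (uncharge_gather, memcg->soft_limit, PAGE_COUNTER_MAX, cgroup_subsys_on_dfl(), page_to_nid()) are the existing ones.

/*
 * Hypothetical helper, not from the posted patch: only split the uncharge
 * batch by node when a soft limit can actually be in use.
 */
static inline bool uncharge_batch_split_by_node(struct uncharge_gather *ug,
						struct page *page)
{
	/* No page gathered yet, so there is no batch to split. */
	if (!ug->dummy_page)
		return false;

	/* cgroup2 has no soft limits; never split the batch by node. */
	if (cgroup_subsys_on_dfl(memory_cgrp_subsys))
		return false;

	/* Only memcgs with a soft limit configured need per-node updates. */
	if (ug->memcg->soft_limit == PAGE_COUNTER_MAX)
		return false;

	return ug->nid != page_to_nid(page);
}

The check in uncharge_page() would then read
"if (ug->memcg != page_memcg(page) || uncharge_batch_split_by_node(ug, page))",
so batches are only broken up for memcgs that participate in soft limit reclaim.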
On 2/9/21 2:22 PM, Johannes Weiner wrote:
> Hello Tim,
>
> On Tue, Feb 09, 2021 at 12:29:47PM -0800, Tim Chen wrote:
>> @@ -6849,7 +6850,9 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug)
>>  	 * exclusive access to the page.
>>  	 */
>>
>> -	if (ug->memcg != page_memcg(page)) {
>> +	if (ug->memcg != page_memcg(page) ||
>> +	    /* uncharge batch update soft limit tree on a node basis */
>> +	    (ug->dummy_page && ug->nid != page_to_nid(page))) {
>
> The fix makes sense to me.
>
> However, unconditionally breaking up the batch by node can
> unnecessarily regress workloads in cgroups that do not have a soft
> limit configured, and cgroup2 which doesn't have soft limits at
> all. Consider an interleaving allocation policy for example.
>
> Can you please further gate on memcg->soft_limit != PAGE_COUNTER_MAX,
> or at least on !cgroup_subsys_on_dfl(memory_cgrp_subsys)?

Sure. Will fix this.

Tim
On Tue 09-02-21 12:29:47, Tim Chen wrote:
> On a per node basis, the mem cgroup soft limit tree on each node tracks
> how much a cgroup has exceeded its soft limit memory limit and sorts
> the cgroup by its excess usage. On page release, the trees are not
> updated right away, until we have gathered a batch of pages belonging to
> the same cgroup. This reduces the frequency of updating the soft limit tree
> and locking of the tree and associated cgroup.
>
> However, the batch of pages could contain pages from multiple nodes but
> only the soft limit tree from one node would get updated. Change the
> logic so that we update the tree in batch of pages, with each batch of
> pages all in the same mem cgroup and memory node. An update is issued for
> the batch of pages of a node collected till now whenever we encounter
> a page belonging to a different node.

I do agree with Johannes here. This shouldn't be done unconditionally
for all memcgs. Wouldn't it be much better to do the fix up in
the mem_cgroup_soft_reclaim path instead. Simply check the excess
before doing any reclaim?

Btw. have you seen this triggering a noticeable misbehaving? I would
expect this to have a rather small effect considering how many sources
of memcg_check_events we have.

Unless I have missed something this has been introduced by 747db954cab6
("mm: memcontrol: use page lists for uncharge batching"). Please add
Fixes tag as well if this is really worth fixing.

> Reviewed-by: Ying Huang <ying.huang@intel.com>
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> ---
>  mm/memcontrol.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index d72449eeb85a..f5a4a0e4e2ec 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -6804,6 +6804,7 @@ struct uncharge_gather {
>  	unsigned long pgpgout;
>  	unsigned long nr_kmem;
>  	struct page *dummy_page;
> +	int nid;
>  };
>
>  static inline void uncharge_gather_clear(struct uncharge_gather *ug)
> @@ -6849,7 +6850,9 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug)
>  	 * exclusive access to the page.
>  	 */
>
> -	if (ug->memcg != page_memcg(page)) {
> +	if (ug->memcg != page_memcg(page) ||
> +	    /* uncharge batch update soft limit tree on a node basis */
> +	    (ug->dummy_page && ug->nid != page_to_nid(page))) {
>  		if (ug->memcg) {
>  			uncharge_batch(ug);
>  			uncharge_gather_clear(ug);
> @@ -6869,6 +6872,7 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug)
>  		ug->pgpgout++;
>
>  	ug->dummy_page = page;
> +	ug->nid = page_to_nid(page);
>  	page->memcg_data = 0;
>  	css_put(&ug->memcg->css);
>  }
> --
> 2.20.1
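For comparison, a sketch of the alternative Michal floats above: leave the uncharge batching alone, tolerate a stale entry in a node's soft limit tree, and re-check the excess in the soft reclaim path instead. The check below is written against the v5.11-era mem_cgroup_soft_reclaim() in mm/memcontrol.c and uses the existing soft_limit_excess() helper; its placement is illustrative, not a tested or posted fix.

	/*
	 * Near the top of mem_cgroup_soft_reclaim(): the per-node soft limit
	 * tree may hold a stale entry because the batched uncharge path only
	 * updates one node's tree per batch.  If this memcg no longer exceeds
	 * its soft limit, skip the reclaim work entirely.
	 */
	if (!soft_limit_excess(root_memcg))
		return 0;

This trades a possibly stale tree ordering for avoiding any extra batch splitting in the hot uncharge path.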
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index d72449eeb85a..f5a4a0e4e2ec 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6804,6 +6804,7 @@ struct uncharge_gather {
 	unsigned long pgpgout;
 	unsigned long nr_kmem;
 	struct page *dummy_page;
+	int nid;
 };
 
 static inline void uncharge_gather_clear(struct uncharge_gather *ug)
@@ -6849,7 +6850,9 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug)
 	 * exclusive access to the page.
 	 */
 
-	if (ug->memcg != page_memcg(page)) {
+	if (ug->memcg != page_memcg(page) ||
+	    /* uncharge batch update soft limit tree on a node basis */
+	    (ug->dummy_page && ug->nid != page_to_nid(page))) {
 		if (ug->memcg) {
 			uncharge_batch(ug);
 			uncharge_gather_clear(ug);
@@ -6869,6 +6872,7 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug)
 		ug->pgpgout++;
 
 	ug->dummy_page = page;
+	ug->nid = page_to_nid(page);
 	page->memcg_data = 0;
 	css_put(&ug->memcg->css);
 }