diff mbox series

[v2,3/3] mm/memcg: add next_mz back to soft limit tree if not reclaimed yet

Message ID 20220312071623.19050-3-richard.weiyang@gmail.com (mailing list archive)
State New
Series [v2,1/3] mm/memcg: mz already removed from rb_tree in mem_cgroup_largest_soft_limit_node()

Commit Message

Wei Yang March 12, 2022, 7:16 a.m. UTC
When memory reclaim failed for a maximum number of attempts and we bail
out of the reclaim loop, we forgot to put the target mem_cgroup chosen
for next reclaim back to the soft limit tree. This prevented pages in
the mem_cgroup from being reclaimed in the future even though the
mem_cgroup exceeded its soft limit.

Suppose two mem_cgroups both exceed their soft limits and the first is
more active than the second. Since a mem_cgroup is added to the soft
limit tree only once every 1024 events, the second one rarely gets a
chance to be put on the tree even though it exceeds its limit.

As time goes on, the first mem_cgroup is kept close to its soft limit
by reclaim activity, while the memory usage of the second mem_cgroup
keeps growing over its soft limit for a long time because it is so
rarely on the tree.

This patch adds next_mz back to the tree to prevent this scenario.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
 mm/memcontrol.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Michal Hocko March 14, 2022, 9:41 a.m. UTC | #1
On Sat 12-03-22 07:16:23, Wei Yang wrote:
> When memory reclaim failed for a maximum number of attempts and we bail
> out of the reclaim loop, we forgot to put the target mem_cgroup chosen
> for next reclaim back to the soft limit tree. This prevented pages in
> the mem_cgroup from being reclaimed in the future even though the
> mem_cgroup exceeded its soft limit.
> 
> Suppose two mem_cgroups both exceed their soft limits and the first is
> more active than the second. Since a mem_cgroup is added to the soft
> limit tree only once every 1024 events, the second one rarely gets a
> chance to be put on the tree even though it exceeds its limit.

yes, 1024 could be just 4MB of memory or 2GB if all the charged pages
are THPs. So the excess can build up considerably.

> As time goes on, the first mem_cgroup is kept close to its soft limit
> by reclaim activity, while the memory usage of the second mem_cgroup
> keeps growing over its soft limit for a long time because it is so
> rarely on the tree.
> 
> This patch adds next_mz back to the tree to prevent this scenario.
> 
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>

Even though your changelog is different the change itself is identical to
https://lore.kernel.org/linux-mm/8d35206601ccf0e1fe021d24405b2a0c2f4e052f.1613584277.git.tim.c.chen@linux.intel.com/
In those cases I would preserve the original authorship by
From: Tim Chen <tim.c.chen@linux.intel.com>
and add his s-o-b before yours.

Acked-by: Michal Hocko <mhocko@suse.com>

Thanks!

> ---
>  mm/memcontrol.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 344a7e891bc5..e803ff02aae2 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3493,8 +3493,13 @@ unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
>  			loop > MEM_CGROUP_MAX_SOFT_LIMIT_RECLAIM_LOOPS))
>  			break;
>  	} while (!nr_reclaimed);
> -	if (next_mz)
> +	if (next_mz) {
> +		spin_lock_irq(&mctz->lock);
> +		excess = soft_limit_excess(next_mz->memcg);
> +		__mem_cgroup_insert_exceeded(next_mz, mctz, excess);
> +		spin_unlock_irq(&mctz->lock);
>  		css_put(&next_mz->memcg->css);
> +	}
>  	return nr_reclaimed;
>  }
>  
> -- 
> 2.33.1
Wei Yang March 14, 2022, 11:05 p.m. UTC | #2
On Mon, Mar 14, 2022 at 10:41:13AM +0100, Michal Hocko wrote:
>On Sat 12-03-22 07:16:23, Wei Yang wrote:
>> When memory reclaim failed for a maximum number of attempts and we bail
>> out of the reclaim loop, we forgot to put the target mem_cgroup chosen
>> for next reclaim back to the soft limit tree. This prevented pages in
>> the mem_cgroup from being reclaimed in the future even though the
>> mem_cgroup exceeded its soft limit.
>> 
>> Suppose two mem_cgroups both exceed their soft limits and the first is
>> more active than the second. Since a mem_cgroup is added to the soft
>> limit tree only once every 1024 events, the second one rarely gets a
>> chance to be put on the tree even though it exceeds its limit.
>
>yes, 1024 could be just 4MB of memory or 2GB if all the charged pages
>are THPs. So the excess can build up considerably.
>
>> As time goes on, the first mem_cgroup is kept close to its soft limit
>> by reclaim activity, while the memory usage of the second mem_cgroup
>> keeps growing over its soft limit for a long time because it is so
>> rarely on the tree.
>> 
>> This patch adds next_mz back to the tree to prevent this scenario.
>> 
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>
>Even though your changelog is different the change itself is identical to
>https://lore.kernel.org/linux-mm/8d35206601ccf0e1fe021d24405b2a0c2f4e052f.1613584277.git.tim.c.chen@linux.intel.com/
>In those cases I would preserve the original authorship by
>From: Tim Chen <tim.c.chen@linux.intel.com>
>and add his s-o-b before yours.

TBH I don't think this is fair.

I didn't see his original change before I sent this patch. It is a
coincidence that we found the same point for improvement.

It hurts me if you want to change authorship. Well, if you really think this
is how it should be, please remove my s-o-b.

>
>Acked-by: Michal Hocko <mhocko@suse.com>
>
>Thanks!
Michal Hocko March 15, 2022, 8:54 a.m. UTC | #3
On Mon 14-03-22 23:05:48, Wei Yang wrote:
> On Mon, Mar 14, 2022 at 10:41:13AM +0100, Michal Hocko wrote:
> >On Sat 12-03-22 07:16:23, Wei Yang wrote:
> >> When memory reclaim failed for a maximum number of attempts and we bail
> >> out of the reclaim loop, we forgot to put the target mem_cgroup chosen
> >> for next reclaim back to the soft limit tree. This prevented pages in
> >> the mem_cgroup from being reclaimed in the future even though the
> >> mem_cgroup exceeded its soft limit.
> >> 
> >> Suppose two mem_cgroups both exceed their soft limits and the first is
> >> more active than the second. Since a mem_cgroup is added to the soft
> >> limit tree only once every 1024 events, the second one rarely gets a
> >> chance to be put on the tree even though it exceeds its limit.
> >
> >yes, 1024 could be just 4MB of memory or 2GB if all the charged pages
> >are THPs. So the excess can build up considerably.
> >
> >> As time goes on, the first mem_cgroup is kept close to its soft limit
> >> by reclaim activity, while the memory usage of the second mem_cgroup
> >> keeps growing over its soft limit for a long time because it is so
> >> rarely on the tree.
> >> 
> >> This patch adds next_mz back to the tree to prevent this scenario.
> >> 
> >> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> >
> >Even though your changelog is different the change itself is identical to
> >https://lore.kernel.org/linux-mm/8d35206601ccf0e1fe021d24405b2a0c2f4e052f.1613584277.git.tim.c.chen@linux.intel.com/
> >In those cases I would preserve the original authorship by
> >From: Tim Chen <tim.c.chen@linux.intel.com>
> >and add his s-o-b before yours.
> 
> TBH I don't think this is fair.
> 
> I didn't see his original change before I sent this patch. It is a
> coincidence that we found the same point for improvement.
> 
> It hurts me if you want to change authorship. Well, if you really think this
> is how it should be, please remove my s-o-b.

OK, fair enough.

Patch

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 344a7e891bc5..e803ff02aae2 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3493,8 +3493,13 @@  unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
 			loop > MEM_CGROUP_MAX_SOFT_LIMIT_RECLAIM_LOOPS))
 			break;
 	} while (!nr_reclaimed);
-	if (next_mz)
+	if (next_mz) {
+		spin_lock_irq(&mctz->lock);
+		excess = soft_limit_excess(next_mz->memcg);
+		__mem_cgroup_insert_exceeded(next_mz, mctz, excess);
+		spin_unlock_irq(&mctz->lock);
 		css_put(&next_mz->memcg->css);
+	}
 	return nr_reclaimed;
 }