diff mbox series

memcg: Don't generate low/min events if either low/min or elow/emin is 0

Message ID 20250403031212.317837-1-longman@redhat.com (mailing list archive)
State New
Series memcg: Don't generate low/min events if either low/min or elow/emin is 0

Commit Message

Waiman Long April 3, 2025, 3:12 a.m. UTC
The test_memcontrol selftest consistently fails its test_memcg_low
sub-test because two of its test child cgroups, which have a memory.low
of 0 or an effective memory.low of 0, still have low events generated
for them, since mem_cgroup_below_low() uses the ">=" operator when
comparing to elow.

The simple fix of changing the operator to ">", however, changes the
way memory reclaim works quite drastically leading to other failures.
So we can't do that without some relatively riskier changes in memory
reclaim.

Another simpler alternative is to avoid reporting below_low failure
if either memory.low or its effective equivalent is 0 which is done
by this patch.
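
As a rough illustration of the difference, here is a minimal userspace
sketch of the two comparisons. The struct model_counter type and the
below_low_old()/below_low_new() names are hypothetical stand-ins and not
kernel APIs, and the mem_cgroup_unprotected() check is deliberately left
out of the model.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the values the kernel helper reads. */
struct model_counter {
	unsigned long usage;	/* models page_counter_read(&memcg->memory) */
	unsigned long low;	/* models memcg->memory.low */
	unsigned long elow;	/* models memcg->memory.elow */
};

/* Comparison used before the patch: elow >= usage. */
static inline bool below_low_old(const struct model_counter *c)
{
	return c->elow >= c->usage;
}

/* Comparison used after the patch: a zero low or elow is never "below low". */
static inline bool below_low_new(const struct model_counter *c)
{
	if (!c->elow || !c->low)
		return false;

	return c->usage <= c->elow;
}
```

In this model, a counter with usage, low and elow all 0 makes
below_low_old() return true while below_low_new() returns false. A
counter with usage 10 and low = elow = 20 returns true from both.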

With this patch applied, the test_memcg_low sub-test finishes
successfully in most cases, though both the test_memcg_low and
test_memcg_min sub-tests may fail occasionally if the memory.current
values fall outside of the expected ranges.

To be consistent, a similar change is applied to mem_cgroup_below_min()
as well.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 include/linux/memcontrol.h | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

Comments

Andrew Morton April 3, 2025, 4:15 a.m. UTC | #1
On Wed,  2 Apr 2025 23:12:12 -0400 Waiman Long <longman@redhat.com> wrote:

> The test_memcontrol selftest consistently fails its test_memcg_low
> sub-test because two of its test child cgroups, which have a memory.low
> of 0 or an effective memory.low of 0, still have low events generated
> for them, since mem_cgroup_below_low() uses the ">=" operator when
> comparing to elow.
> 
> The simple fix of changing the operator to ">", however, changes the
> way memory reclaim works quite drastically leading to other failures.
> So we can't do that without some relatively riskier changes in memory
> reclaim.
> 
> Another simpler alternative is to avoid reporting below_low failure
> if either memory.low or its effective equivalent is 0 which is done
> by this patch.
> 
> With this patch applied, the test_memcg_low sub-test finishes
> successfully in most cases, though both the test_memcg_low and
> test_memcg_min sub-tests may fail occasionally if the memory.current
> values fall outside of the expected ranges.
> 

Well, maybe the selftest needs to be changed?

Please describe this patch in terms of "what is wrong with the code at
present" and "how that is fixed" and "what is the impact upon
userspace".

Is this change backwardly compatible with existing userspace?

> To be consistent, a similar change is applied to mem_cgroup_below_min()
> as well.

Ditto.

Waiman Long April 3, 2025, 2:03 p.m. UTC | #2
On 4/3/25 12:15 AM, Andrew Morton wrote:
> On Wed,  2 Apr 2025 23:12:12 -0400 Waiman Long <longman@redhat.com> wrote:
>
>> The test_memcontrol selftest consistently fails its test_memcg_low
>> sub-test because two of its test child cgroups, which have a memory.low
>> of 0 or an effective memory.low of 0, still have low events generated
>> for them, since mem_cgroup_below_low() uses the ">=" operator when
>> comparing to elow.
>>
>> The simple fix of changing the operator to ">", however, changes the
>> way memory reclaim works quite drastically leading to other failures.
>> So we can't do that without some relatively riskier changes in memory
>> reclaim.
>>
>> Another simpler alternative is to avoid reporting below_low failure
>> if either memory.low or its effective equivalent is 0 which is done
>> by this patch.
>>
>> With this patch applied, the test_memcg_low sub-test finishes
>> successfully in most cases, though both the test_memcg_low and
>> test_memcg_min sub-tests may fail occasionally if the memory.current
>> values fall outside of the expected ranges.
>>
> Well, maybe the selftest needs to be changed?
Yes, it probably needs some minor adjustments to prevent sporadic
failures as much as possible. I will look at that.
>
> Please describe this patch in terms of "what is wrong with the code at
> present" and "how that is fixed" and "what is the impact upon
> userspace".
Will do.
>
> Is this change backwardly compatible with existing userspace?

I doubt there will be much impact. There are two cases where the 
behavior will be different.

First of all, if the user doesn't explicitly set low/min, it will remain
0. However, the low/min event counts may become non-zero if memory
reclaim is happening around the cgroup. That is certainly unexpected by
users. I doubt users depend on a non-zero low/min event count because
that may or may not happen.

The second case is when we set up an empty cgroup with no task in it. 
The low/min value can be set, but the effective low/min value will be 0. 
Again, low/min events may be triggered and I doubt users will be 
expecting that.
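
For reference, a userspace tool that could observe this difference is one
that reads the "low" entry from a cgroup's memory.events file. The helper
below is only a sketch of such a reader: memcg_event_count() is a
hypothetical name, and it assumes the caller has already read the file
into a NUL-terminated string of "key value" lines.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the counter stored under @key in @events, the text of a
 * memory.events file ("key value" pairs, one per line), or -1 if the
 * key is not present.
 */
static inline long long memcg_event_count(const char *events, const char *key)
{
	size_t klen = strlen(key);
	const char *line = events;

	while (line && *line) {
		long long val;

		/* Require a full key match followed by a single space. */
		if (!strncmp(line, key, klen) && line[klen] == ' ' &&
		    sscanf(line + klen + 1, "%lld", &val) == 1)
			return val;

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}

	return -1;
}
```

Given the made-up sample text "low 3\nhigh 0\nmax 0\n", a lookup of "low"
returns 3 and a lookup of a key that is absent returns -1.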

Cheers,
Longman

Patch

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 53364526d877..4d4a1f159eaa 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -601,21 +601,31 @@  static inline bool mem_cgroup_unprotected(struct mem_cgroup *target,
 static inline bool mem_cgroup_below_low(struct mem_cgroup *target,
 					struct mem_cgroup *memcg)
 {
+	unsigned long elow;
+
 	if (mem_cgroup_unprotected(target, memcg))
 		return false;
 
-	return READ_ONCE(memcg->memory.elow) >=
-		page_counter_read(&memcg->memory);
+	elow = READ_ONCE(memcg->memory.elow);
+	if (!elow || !READ_ONCE(memcg->memory.low))
+		return false;
+
+	return page_counter_read(&memcg->memory) <= elow;
 }
 
 static inline bool mem_cgroup_below_min(struct mem_cgroup *target,
 					struct mem_cgroup *memcg)
 {
+	unsigned long emin;
+
 	if (mem_cgroup_unprotected(target, memcg))
 		return false;
 
-	return READ_ONCE(memcg->memory.emin) >=
-		page_counter_read(&memcg->memory);
+	emin = READ_ONCE(memcg->memory.emin);
+	if (!emin || !READ_ONCE(memcg->memory.min))
+		return false;
+
+	return page_counter_read(&memcg->memory) <= emin;
 }
 
 int __mem_cgroup_charge(struct folio *folio, struct mm_struct *mm, gfp_t gfp);