diff mbox series

[next,-v1,3/5] memcg: simplify the mem_cgroup_update_lru_size function

Message ID 20241206013512.2883617-4-chenridong@huaweicloud.com (mailing list archive)
State New
Headers show
Series Some cleanup for memcg | expand

Commit Message

Chen Ridong Dec. 6, 2024, 1:35 a.m. UTC
From: Chen Ridong <chenridong@huawei.com>

In the `mem_cgroup_update_lru_size` function, the `lru_size` should be
updated by adding `nr_pages` regardless of whether `nr_pages` is greater
than 0 or less than 0. To simplify this function, add a check for
`nr_pages` == 0. When `nr_pages` is not equal to 0, perform the same
actions.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
---
 mm/memcontrol.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

Comments

Yu Zhao Dec. 6, 2024, 5:33 a.m. UTC | #1
On Thu, Dec 5, 2024 at 6:45 PM Chen Ridong <chenridong@huaweicloud.com> wrote:
>
> From: Chen Ridong <chenridong@huawei.com>
>
> In the `mem_cgroup_update_lru_size` function, the `lru_size` should be
> updated by adding `nr_pages` regardless of whether `nr_pages` is greater
> than 0 or less than 0. To simplify this function, add a check for
> `nr_pages` == 0. When `nr_pages` is not equal to 0, perform the same
> actions.
>
> Signed-off-by: Chen Ridong <chenridong@huawei.com>

NAK.

The commit that added that clearly explains why it was done that way.
Chen Ridong Dec. 6, 2024, 6:40 a.m. UTC | #2
On 2024/12/6 13:33, Yu Zhao wrote:
> On Thu, Dec 5, 2024 at 6:45 PM Chen Ridong <chenridong@huaweicloud.com> wrote:
>>
>> From: Chen Ridong <chenridong@huawei.com>
>>
>> In the `mem_cgroup_update_lru_size` function, the `lru_size` should be
>> updated by adding `nr_pages` regardless of whether `nr_pages` is greater
>> than 0 or less than 0. To simplify this function, add a check for
>> `nr_pages` == 0. When `nr_pages` is not equal to 0, perform the same
>> actions.
>>
>> Signed-off-by: Chen Ridong <chenridong@huawei.com>
> 
> NAK.
> 
> The commit that added that clearly explains why it was done that way.

Thank you for your reply.

I have read the commit message for ca707239e8a7 ("mm: update_lru_size
warn and reset bad lru_size") before sending my patch. However, I did
not quite understand why we need to deal with the difference between
'nr_pages > 0' and 'nr_pages < 0'.


The 'lru_zone_size' can only be modified in the
'mem_cgroup_update_lru_size' function. Only subtracting 'nr_pages' or
adding 'nr_pages' in a way that causes an overflow can make the size < 0.

For case 1, subtracting 'nr_pages', we should issue a warning if the
size goes below 0. For case 2, when adding 'nr_pages' results in an
overflow, there will be no warning and the size won't be reset to 0 the
first time it occurs . It might be that a warning will be issued the
next time 'mem_cgroup_update_lru_size' is called to modify the
'lru_zone_size'. However, as the commit message said, "the first
occurrence is the most informative," and it seems we have missed that
first occurrence.

As the commit message said: "and then the vast unsigned long size draws
page reclaim into a loop of repeatedly", I think that a warning should
be issued and 'lru_zone_size' should be reset whenever 'size < 0' occurs
for the first time, whether from subtracting or adding 'nr_pages'.

I am be grateful if you can explain more details, it has confused me for
a while. Thank you very much.

Best regards,
Ridong
Hugh Dickins Dec. 6, 2024, 8:24 a.m. UTC | #3
On Fri, 6 Dec 2024, Chen Ridong wrote:
> On 2024/12/6 13:33, Yu Zhao wrote:
> > On Thu, Dec 5, 2024 at 6:45 PM Chen Ridong <chenridong@huaweicloud.com> wrote:
> >> From: Chen Ridong <chenridong@huawei.com>
> >>
> >> In the `mem_cgroup_update_lru_size` function, the `lru_size` should be
> >> updated by adding `nr_pages` regardless of whether `nr_pages` is greater
> >> than 0 or less than 0. To simplify this function, add a check for
> >> `nr_pages` == 0. When `nr_pages` is not equal to 0, perform the same
> >> actions.
> >>
> >> Signed-off-by: Chen Ridong <chenridong@huawei.com>
> > 
> > NAK.
> > 
> > The commit that added that clearly explains why it was done that way.

Thanks Yu: I did spot this going by, but was indeed hoping that someone
else would NAK it, with more politeness than I could muster at the time!

> 
> Thank you for your reply.
> 
> I have read the commit message for ca707239e8a7 ("mm: update_lru_size
> warn and reset bad lru_size") before sending my patch. However, I did
> not quite understand why we need to deal with the difference between
> 'nr_pages > 0' and 'nr_pages < 0'.
> 
> 
> The 'lru_zone_size' can only be modified in the
> 'mem_cgroup_update_lru_size' function. Only subtracting 'nr_pages' or
> adding 'nr_pages' in a way that causes an overflow can make the size < 0.
> 
> For case 1, subtracting 'nr_pages', we should issue a warning if the
> size goes below 0. For case 2, when adding 'nr_pages' results in an
> overflow, there will be no warning and the size won't be reset to 0 the
> first time it occurs . It might be that a warning will be issued the
> next time 'mem_cgroup_update_lru_size' is called to modify the
> 'lru_zone_size'. However, as the commit message said, "the first
> occurrence is the most informative," and it seems we have missed that
> first occurrence.
> 
> As the commit message said: "and then the vast unsigned long size draws
> page reclaim into a loop of repeatedly", I think that a warning should
> be issued and 'lru_zone_size' should be reset whenever 'size < 0' occurs
> for the first time, whether from subtracting or adding 'nr_pages'.

One thing, not obvious, but important to understand, is that WARN_ONCE()
only issues a warning the first time its condition is found true, but
it returns the true or false of the condition every time.  So lru_size
is forced to 0 each time an inconsistency is detected.

(But IIRC VM_WARN_ON_ONCE() does not behave in that useful way; or maybe
I've got that macro name wrong - we have so many slightly differing.)

Perhaps understanding that will help you to make better sense of the
order of events in this function.

Another thing to understand: it's called before adding folio to list,
but after removing folio from list: when it can usefully compare whether
the emptiness of the list correctly matches lru_size 0.  It cannot do so
when adding if you "simplify" it in the way that you did.

You might be focusing too much on the "size < 0" aspect of it, or you
might be worrying more than I did about size growing larger and larger
until it wraps to negative (not likely on 64-bit, unless corrupted).

I hope these remarks help, but you need to think through it again
for yourself.

Hugh

> 
> I am be grateful if you can explain more details, it has confused me for
> a while. Thank you very much.
> 
> Best regards,
> Ridong
Chen Ridong Dec. 6, 2024, 10:02 a.m. UTC | #4
On 2024/12/6 16:24, Hugh Dickins wrote:
> On Fri, 6 Dec 2024, Chen Ridong wrote:
>> On 2024/12/6 13:33, Yu Zhao wrote:
>>> On Thu, Dec 5, 2024 at 6:45 PM Chen Ridong <chenridong@huaweicloud.com> wrote:
>>>> From: Chen Ridong <chenridong@huawei.com>
>>>>
>>>> In the `mem_cgroup_update_lru_size` function, the `lru_size` should be
>>>> updated by adding `nr_pages` regardless of whether `nr_pages` is greater
>>>> than 0 or less than 0. To simplify this function, add a check for
>>>> `nr_pages` == 0. When `nr_pages` is not equal to 0, perform the same
>>>> actions.
>>>>
>>>> Signed-off-by: Chen Ridong <chenridong@huawei.com>
>>>
>>> NAK.
>>>
>>> The commit that added that clearly explains why it was done that way.
> 
> Thanks Yu: I did spot this going by, but was indeed hoping that someone
> else would NAK it, with more politeness than I could muster at the time!
> 
>>
>> Thank you for your reply.
>>
>> I have read the commit message for ca707239e8a7 ("mm: update_lru_size
>> warn and reset bad lru_size") before sending my patch. However, I did
>> not quite understand why we need to deal with the difference between
>> 'nr_pages > 0' and 'nr_pages < 0'.
>>
>>
>> The 'lru_zone_size' can only be modified in the
>> 'mem_cgroup_update_lru_size' function. Only subtracting 'nr_pages' or
>> adding 'nr_pages' in a way that causes an overflow can make the size < 0.
>>
>> For case 1, subtracting 'nr_pages', we should issue a warning if the
>> size goes below 0. For case 2, when adding 'nr_pages' results in an
>> overflow, there will be no warning and the size won't be reset to 0 the
>> first time it occurs . It might be that a warning will be issued the
>> next time 'mem_cgroup_update_lru_size' is called to modify the
>> 'lru_zone_size'. However, as the commit message said, "the first
>> occurrence is the most informative," and it seems we have missed that
>> first occurrence.
>>
>> As the commit message said: "and then the vast unsigned long size draws
>> page reclaim into a loop of repeatedly", I think that a warning should
>> be issued and 'lru_zone_size' should be reset whenever 'size < 0' occurs
>> for the first time, whether from subtracting or adding 'nr_pages'.
> 
> One thing, not obvious, but important to understand, is that WARN_ONCE()
> only issues a warning the first time its condition is found true, but
> it returns the true or false of the condition every time.  So lru_size
> is forced to 0 each time an inconsistency is detected.
> 

Thank you for your explanation.
My patch does not change this logic.

> (But IIRC VM_WARN_ON_ONCE() does not behave in that useful way; or maybe
> I've got that macro name wrong - we have so many slightly differing.)
> 
> Perhaps understanding that will help you to make better sense of the
> order of events in this function.
> 
> Another thing to understand: it's called before adding folio to list,
> but after removing folio from list: when it can usefully compare whether
> the emptiness of the list correctly matches lru_size 0.  It cannot do so
> when adding if you "simplify" it in the way that you did.
> 

The commit b4536f0c829c ("mm, memcg: fix the active list aging for
lowmem requests when memcg is enabled") has removed the emptiness check.

What is meaningful is that we can determine whether the size is smaller
than 0 before adding `nr_pages`. However, as I mentioned, the
`lru_zone_size` can only be modified in the `mem_cgroup_update_lru_size`
function, which means that a warning must have been issued if the size
was smaller than 0 before adding `nr_pages` when `nr_pages` is greater
than 0. In this case, it will not issue a warning again.

Perhaps "when it can usefully compare whether the emptiness of the list
correctly matches lru_size 0" is not useful now.

> You might be focusing too much on the "size < 0" aspect of it, or you
> might be worrying more than I did about size growing larger and larger
> until it wraps to negative (not likely on 64-bit, unless corrupted).> I hope these remarks help, but you need to think through it again
> for yourself.
> 
> Hugh

Thank you very much for your patience.

Best regards,
Ridong

> 
>>
>> I am be grateful if you can explain more details, it has confused me for
>> a while. Thank you very much.
>>
>> Best regards,
>> Ridong
Shakeel Butt Dec. 6, 2024, 9:20 p.m. UTC | #5
On Fri, Dec 06, 2024 at 12:24:54AM -0800, Hugh Dickins wrote:
[...]
> Another thing to understand: it's called before adding folio to list,
> but after removing folio from list: when it can usefully compare whether
> the emptiness of the list correctly matches lru_size 0.

I think one source of confusion might be that this "emptiness" check has
been removed by commit b4536f0c829c because of maintaining the list size
per-zone and actual list is shared between zones of a node.

> It cannot do so
> when adding if you "simplify" it in the way that you did.
>
Chen Ridong Dec. 10, 2024, 11:35 a.m. UTC | #6
On 2024/12/7 5:20, Shakeel Butt wrote:
> On Fri, Dec 06, 2024 at 12:24:54AM -0800, Hugh Dickins wrote:
> [...]
>> Another thing to understand: it's called before adding folio to list,
>> but after removing folio from list: when it can usefully compare whether
>> the emptiness of the list correctly matches lru_size 0.
> 
> I think one source of confusion might be that this "emptiness" check has
> been removed by commit b4536f0c829c because of maintaining the list size
> per-zone and actual list is shared between zones of a node.
> 

Agree.
Maybe it doesn't have to distinguish between  "size > 0" and "size < 0" now?

Thanks,
Ridong
>> It cannot do so
>> when adding if you "simplify" it in the way that you did.
>>
diff mbox series

Patch

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index da6e4e9bd0fa..f977e0be1c04 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1280,15 +1280,14 @@  void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
 	unsigned long *lru_size;
 	long size;
 
-	if (mem_cgroup_disabled())
+	if (mem_cgroup_disabled() || !nr_pages)
 		return;
 
 	mz = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
 	lru_size = &mz->lru_zone_size[zid][lru];
 
-	if (nr_pages < 0)
-		*lru_size += nr_pages;
 
+	*lru_size += nr_pages;
 	size = *lru_size;
 	if (WARN_ONCE(size < 0,
 		"%s(%p, %d, %d): lru_size %ld\n",
@@ -1296,9 +1295,6 @@  void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
 		VM_BUG_ON(1);
 		*lru_size = 0;
 	}
-
-	if (nr_pages > 0)
-		*lru_size += nr_pages;
 }
 
 /**