
[2/2] mm/vmscan: shrink slab in node reclaim

Message ID 1557389269-31315-2-git-send-email-laoar.shao@gmail.com (mailing list archive)
State New, archived
Series [1/2] mm/vmstat: expose min_slab_pages in /proc/zoneinfo

Commit Message

Yafang Shao May 9, 2019, 8:07 a.m. UTC
In node reclaim, may_shrinkslab is 0 by default,
hence shrink_slab will never be performed there,
whereas shrink_slab should be performed if the reclaimable slab is over
the min slab limit.

This issue is very easy to reproduce: first you continuously cat a random
non-existent file to produce more and more dentries, then you read a big
file to produce page cache. Finally you will find that the dentries will
never be shrunk.
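
For illustration, a minimal userspace repro sketch of the above (the paths
are hypothetical; any set of non-existent names plus any sufficiently large
file will do):

/* repro.c: step 1 builds up negative dentries by looking up names that
 * do not exist; step 2 fills the node with page cache by reading a
 * large file. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
	static char buf[1 << 20];
	char path[64];
	struct stat st;
	long i;
	int fd;

	for (i = 0; i < 10 * 1000 * 1000; i++) {
		snprintf(path, sizeof(path), "/tmp/no-such-file-%ld", i);
		stat(path, &st);	/* fails; leaves a negative dentry behind */
	}

	fd = open("/data/bigfile", O_RDONLY);	/* hypothetical large file */
	if (fd < 0)
		return 1;
	while (read(fd, buf, sizeof(buf)) > 0)
		;
	close(fd);
	return 0;
}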

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 mm/vmscan.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
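
For context on why this matters: scan_control in __node_reclaim() is set up
with a designated initializer, so any field not listed (such as may_shrinkslab
before this patch) is zero. In this kernel generation, shrink_node() only
invokes the slab shrinkers when that flag is set, roughly as in the abridged
sketch below (reconstructed for illustration, not part of this patch):

	/* shrink_node(): per-memcg loop, abridged */
	memcg = mem_cgroup_iter(root, NULL, &reclaim);
	do {
		shrink_node_memcg(pgdat, memcg, sc, &lru_pages);

		if (sc->may_shrinkslab)
			shrink_slab(sc->gfp_mask, pgdat->node_id,
				    memcg, sc->priority);
	} while ((memcg = mem_cgroup_iter(root, memcg, &reclaim)));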

Comments

Andrew Morton May 22, 2019, 9:40 p.m. UTC | #1
On Thu,  9 May 2019 16:07:49 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:

> In node reclaim, may_shrinkslab is 0 by default,
> hence shrink_slab will never be performed there,
> whereas shrink_slab should be performed if the reclaimable slab is over
> the min slab limit.
> 
> This issue is very easy to reproduce: first you continuously cat a random
> non-existent file to produce more and more dentries, then you read a big
> file to produce page cache. Finally you will find that the dentries will
> never be shrunk.

It does sound like an oversight.

> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -4141,6 +4141,8 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
>  		.may_unmap = !!(node_reclaim_mode & RECLAIM_UNMAP),
>  		.may_swap = 1,
>  		.reclaim_idx = gfp_zone(gfp_mask),
> +		.may_shrinkslab = node_page_state(pgdat, NR_SLAB_RECLAIMABLE) >
> +				  pgdat->min_slab_pages,
>  	};
>  
>  	trace_mm_vmscan_node_reclaim_begin(pgdat->node_id, order,
> @@ -4158,15 +4160,13 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
>  	reclaim_state.reclaimed_slab = 0;
>  	p->reclaim_state = &reclaim_state;
>  
> -	if (node_pagecache_reclaimable(pgdat) > pgdat->min_unmapped_pages) {

Would it be better to do

	if (node_pagecache_reclaimable(pgdat) > pgdat->min_unmapped_pages ||
			sc.may_shrinkslab) {

>  		/*
>  		 * Free memory by calling shrink node with increasing
>  		 * priorities until we have enough memory freed.
>  		 */

The above will want re-indenting and re-right-justifying.

> -		do {
> -			shrink_node(pgdat, &sc);
> -		} while (sc.nr_reclaimed < nr_pages && --sc.priority >= 0);
> -	}
> +	do {
> +		shrink_node(pgdat, &sc);
> +	} while (sc.nr_reclaimed < nr_pages && --sc.priority >= 0);

Won't this cause pagecache reclaim and compaction which previously did
not occur?  If yes, what are the effects of this and are they
desirable?  If no, perhaps call shrink_slab() directly in this case. 
Or something like that.

It's unclear why min_unmapped_pages (min_unmapped_ratio) exists.  Is it
a batch-things-up efficiency thing?
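
For reference, min_unmapped_pages is the per-node page threshold derived from
the vm.min_unmapped_ratio sysctl, i.e. a percentage of each zone's managed
pages summed over the node; an approximate sketch of the setup code (details
may differ slightly between kernel versions):

	/* mm/page_alloc.c (approximate): turn the ratio into a page count */
	static void setup_min_unmapped_ratio(void)
	{
		pg_data_t *pgdat;
		struct zone *zone;

		for_each_online_pgdat(pgdat)
			pgdat->min_unmapped_pages = 0;

		for_each_zone(zone)
			zone->zone_pgdat->min_unmapped_pages +=
				(zone_managed_pages(zone) *
				 sysctl_min_unmapped_ratio) / 100;
	}
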
Yafang Shao May 23, 2019, 4:56 a.m. UTC | #2
On Thu, May 23, 2019 at 5:40 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Thu,  9 May 2019 16:07:49 +0800 Yafang Shao <laoar.shao@gmail.com> wrote:
>
> > In node reclaim, may_shrinkslab is 0 by default,
> > hence shrink_slab will never be performed there,
> > whereas shrink_slab should be performed if the reclaimable slab is over
> > the min slab limit.
> >
> > This issue is very easy to reproduce: first you continuously cat a random
> > non-existent file to produce more and more dentries, then you read a big
> > file to produce page cache. Finally you will find that the dentries will
> > never be shrunk.
>
> It does sound like an oversight.
>
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -4141,6 +4141,8 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
> >               .may_unmap = !!(node_reclaim_mode & RECLAIM_UNMAP),
> >               .may_swap = 1,
> >               .reclaim_idx = gfp_zone(gfp_mask),
> > +             .may_shrinkslab = node_page_state(pgdat, NR_SLAB_RECLAIMABLE) >
> > +                               pgdat->min_slab_pages,
> >       };
> >
> >       trace_mm_vmscan_node_reclaim_begin(pgdat->node_id, order,
> > @@ -4158,15 +4160,13 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
> >       reclaim_state.reclaimed_slab = 0;
> >       p->reclaim_state = &reclaim_state;
> >
> > -     if (node_pagecache_reclaimable(pgdat) > pgdat->min_unmapped_pages) {
>
> Would it be better to do
>
>         if (node_pagecache_reclaimable(pgdat) > pgdat->min_unmapped_pages ||
>                         sc.may_shrinkslab) {
>

This if condition is always true here, because we already check these
limits in node_reclaim(); see below:

    if (node_pagecache_reclaimable(pgdat) <= pgdat->min_unmapped_pages &&
        node_page_state(pgdat, NR_SLAB_RECLAIMABLE) <= pgdat->min_slab_pages)
        return NODE_RECLAIM_FULL;


> >               /*
> >                * Free memory by calling shrink node with increasing
> >                * priorities until we have enough memory freed.
> >                */
>
> The above will want re-indenting and re-right-justifying.
>

Sorry about the carelessness.

> > -             do {
> > -                     shrink_node(pgdat, &sc);
> > -             } while (sc.nr_reclaimed < nr_pages && --sc.priority >= 0);
> > -     }
> > +     do {
> > +             shrink_node(pgdat, &sc);
> > +     } while (sc.nr_reclaimed < nr_pages && --sc.priority >= 0);
>
> Won't this cause pagecache reclaim and compaction which previously did
> not occur?  If yes, what are the effects of this and are they
> desirable?  If no, perhaps call shrink_slab() directly in this case.
> Or something like that.
>

It may cause pagecache reclaim and compaction even if
node_pagecache_reclaimable() is still less than
pgdat->min_unmapped_pages.
The active file pages will be deactivated and the inactive file pages
will be reclaimed.
(I traced this behavior with the mm_vmscan_lru_shrink_active and
mm_vmscan_lru_shrink_inactive tracepoints.)

If we don't like this behavior, what about the change below?

@@ -4166,6 +4166,17 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
                do {
                        shrink_node(pgdat, &sc);
                } while (sc.nr_reclaimed < nr_pages && --sc.priority >= 0);
+       } else {
+               struct mem_cgroup *memcg;
+               struct mem_cgroup_reclaim_cookie reclaim = {
+                       .pgdat = pgdat,
+                       .priority = sc.priority,
+               };
+
+               memcg = mem_cgroup_iter(NULL, NULL, &reclaim);
+               do {
+                       shrink_slab(sc.gfp_mask, pgdat->node_id, memcg, sc.priority);
+               } while ((memcg = mem_cgroup_iter(NULL, memcg, &reclaim)));

        }


> It's unclear why min_unmapped_pages (min_unmapped_ratio) exists. Is it

I have tried to understand it, but still don't have a clear idea yet,
so I just left it as-is.

> a batch-things-up efficiency thing?

I guess so.

Thanks
Yafang

Patch

diff --git a/mm/vmscan.c b/mm/vmscan.c
index d9c3e87..2c73223 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4141,6 +4141,8 @@  static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
 		.may_unmap = !!(node_reclaim_mode & RECLAIM_UNMAP),
 		.may_swap = 1,
 		.reclaim_idx = gfp_zone(gfp_mask),
+		.may_shrinkslab = node_page_state(pgdat, NR_SLAB_RECLAIMABLE) >
+				  pgdat->min_slab_pages,
 	};
 
 	trace_mm_vmscan_node_reclaim_begin(pgdat->node_id, order,
@@ -4158,15 +4160,13 @@  static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
 	reclaim_state.reclaimed_slab = 0;
 	p->reclaim_state = &reclaim_state;
 
-	if (node_pagecache_reclaimable(pgdat) > pgdat->min_unmapped_pages) {
 		/*
 		 * Free memory by calling shrink node with increasing
 		 * priorities until we have enough memory freed.
 		 */
-		do {
-			shrink_node(pgdat, &sc);
-		} while (sc.nr_reclaimed < nr_pages && --sc.priority >= 0);
-	}
+	do {
+		shrink_node(pgdat, &sc);
+	} while (sc.nr_reclaimed < nr_pages && --sc.priority >= 0);
 
 	p->reclaim_state = NULL;
 	current->flags &= ~PF_SWAPWRITE;