Message ID | 20240408080539.14282-1-liuhailong@oppo.com (mailing list archive)
---|---
State | New
Series | mm: vmscan: do not skip CMA while LRU is full of CMA folios
On Mon, Apr 8, 2024 at 8:06 PM <liuhailong@oppo.com> wrote:
>
> From: liuhailong <liuhailong@oppo.com>
>
> If the allocation flag isn't movable, commit 5da226dbfce3 ("mm:
> skip CMA pages when they are not available") skips CMA during direct
> reclamation. This generally speeds up the reclamation of non-movable
> folios. However, in scenarios with limited system memory and a majority
> of reclaimable folios in LRU being from CMA, this can result in prolonged
> idle loops where the system reclaims nothing but consumes CPU.
>
> I traced the process of a thread entering direct reclamation,
> and printed the relevant information of the sc (scan_control).
>
> __alloc_pages_direct_reclaim start
> sc->priority:9 sc->nr_skipped_cma:32208  sc->nr_scanned:36  sc->nr_reclaimed:3
> sc->priority:8 sc->nr_skipped_cma:32199  sc->nr_scanned:69  sc->nr_reclaimed:3
> sc->priority:7 sc->nr_skipped_cma:198405 sc->nr_scanned:121 sc->nr_reclaimed:3
> sc->priority:6 sc->nr_skipped_cma:236713 sc->nr_scanned:147 sc->nr_reclaimed:3
> sc->priority:5 sc->nr_skipped_cma:708209 sc->nr_scanned:379 sc->nr_reclaimed:3
> sc->priority:4 sc->nr_skipped_cma:785537 sc->nr_scanned:646 sc->nr_reclaimed:3
> __alloc_pages_direct_reclaim end duration 3356ms
>
> Continuously skipping CMA even when the LRU is filled with CMA
> folios can also result in lmkd failing to terminate processes. The
> duration of psi_memstall (measured from the exit to the entry of
> __alloc_pages_direct_reclaim) becomes excessively long, lasting for
> example a couple of seconds. Consequently, lmkd fails to awaken and
> terminate processes promptly.

If you're planning to send a newer version, it's best to include the
reproducer here.

>
> This patch introduces no_skip_cma and sets it to true when the number of
> skipped CMA folios is excessively high. It offers two benefits: rather
> than wasting time in idle loops, it's better to assist other threads in
> reclaiming some folios; and this shortens the duration of psi_memstall
> and ensures timely activation of lmkd within a few milliseconds.
>
> Signed-off-by: liuhailong <liuhailong@oppo.com>

I believe this patch tackles a niche scenario where CMA is configured
to be large.

Acked-by: Barry Song <baohua@kernel.org>

> ---

This is evidently version 3 of your previous patch titled
"[PATCH v2] Revert "mm: skip CMA pages when they are not available"" [1].
It's advisable to provide a brief explanation of the changes made from
v2 and include a link to v2 here.

[1] https://lore.kernel.org/linux-mm/CAGsJ_4wD-NiquhnhK_6TGEF4reTGO9pVNGyYBnYZ1inVFc40WQ@mail.gmail.com/

>  mm/vmscan.c | 23 ++++++++++++++++++++++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index fa321c125099..2c74c1c94d88 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -114,6 +114,9 @@ struct scan_control {
>  	/* Proactive reclaim invoked by userspace through memory.reclaim */
>  	unsigned int proactive:1;
>
> +	/* Can reclaim skip cma pages */
> +	unsigned int no_skip_cma:1;
> +
>  	/*
>  	 * Cgroup memory below memory.low is protected as long as we
>  	 * don't threaten to OOM. If any cgroup is reclaimed at
> @@ -157,6 +160,9 @@ struct scan_control {
>  	/* Number of pages freed so far during a call to shrink_zones() */
>  	unsigned long nr_reclaimed;
>
> +	/* Number of cma-pages skipped so far during a call to shrink_zones() */
> +	unsigned long nr_skipped_cma;
> +
>  	struct {
>  		unsigned int dirty;
>  		unsigned int unqueued_dirty;
> @@ -1572,9 +1578,13 @@ static __always_inline void update_lru_sizes(struct lruvec *lruvec,
>   */
>  static bool skip_cma(struct folio *folio, struct scan_control *sc)
>  {
> -	return !current_is_kswapd() &&
> +	bool ret = !current_is_kswapd() && !sc->no_skip_cma &&
>  		gfp_migratetype(sc->gfp_mask) != MIGRATE_MOVABLE &&
>  		folio_migratetype(folio) == MIGRATE_CMA;
> +
> +	if (ret)
> +		sc->nr_skipped_cma += folio_nr_pages(folio);
> +	return ret;
>  }
>  #else
>  static bool skip_cma(struct folio *folio, struct scan_control *sc)
> @@ -6188,6 +6198,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
>  			vmpressure_prio(sc->gfp_mask, sc->target_mem_cgroup,
>  					sc->priority);
>  		sc->nr_scanned = 0;
> +		sc->nr_skipped_cma = 0;
>  		shrink_zones(zonelist, sc);
>
>  		if (sc->nr_reclaimed >= sc->nr_to_reclaim)
> @@ -6202,6 +6213,16 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
>  		 */
>  		if (sc->priority < DEF_PRIORITY - 2)
>  			sc->may_writepage = 1;
> +
> +		/*
> +		 * If we're getting trouble reclaiming non-cma pages and
> +		 * currently a substantial number of CMA pages on LRU,
> +		 * start reclaiming cma pages to alleviate other threads
> +		 * and decrease lru size.
> +		 */
> +		if (sc->priority < DEF_PRIORITY - 2 &&
> +		    sc->nr_scanned < (sc->nr_skipped_cma >> 3))
> +			sc->no_skip_cma = 1;
>  	} while (--sc->priority >= 0);
>
>  	last_pgdat = NULL;
> --
> 2.36.1
>

Thanks
Barry
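To illustrate the change being acked above, here is a minimal userspace model of the reworked skip_cma(). This is an illustrative sketch, not kernel code: the kernel's folio, gfp_migratetype() and current_is_kswapd() checks are replaced with plain booleans, and every name below is invented for the example.

        /*
         * Userspace model of the reworked skip_cma(); everything here is an
         * invented stand-in for the kernel's folio/gfp/kswapd machinery and
         * only shows when a CMA folio is skipped and counted.
         */
        #include <stdio.h>
        #include <stdbool.h>

        struct model_scan_control {
                bool no_skip_cma;             /* bit added by the patch */
                unsigned long nr_skipped_cma; /* counter added by the patch */
        };

        static bool model_skip_cma(struct model_scan_control *sc, bool is_kswapd,
                                   bool movable_alloc, bool folio_is_cma,
                                   unsigned long folio_pages)
        {
                /*
                 * Mirrors: !current_is_kswapd() && !sc->no_skip_cma &&
                 * gfp_migratetype() != MIGRATE_MOVABLE && folio is MIGRATE_CMA.
                 */
                bool skip = !is_kswapd && !sc->no_skip_cma &&
                            !movable_alloc && folio_is_cma;

                if (skip)
                        sc->nr_skipped_cma += folio_pages;
                return skip;
        }

        int main(void)
        {
                struct model_scan_control sc = { 0 };

                /* Direct reclaim, non-movable allocation, CMA folio: skipped. */
                printf("flag clear: skipped = %d\n",
                       model_skip_cma(&sc, false, false, true, 1));

                /* Once the heuristic sets no_skip_cma, the same folio is reclaimable. */
                sc.no_skip_cma = true;
                printf("flag set:   skipped = %d\n",
                       model_skip_cma(&sc, false, false, true, 1));
                printf("nr_skipped_cma = %lu\n", sc.nr_skipped_cma);
                return 0;
        }

Run as a plain C program this prints "skipped = 1" before the flag is set and "skipped = 0" afterwards, which is the behavioral point of the patch: after the heuristic fires, CMA folios become eligible for reclaim under this scan_control instead of being skipped indefinitely.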
On Mon, Apr 08, 2024 at 04:05:39PM +0800, liuhailong@oppo.com wrote:
> From: liuhailong <liuhailong@oppo.com>
> @@ -6202,6 +6213,16 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
>  		 */
>  		if (sc->priority < DEF_PRIORITY - 2)
>  			sc->may_writepage = 1;
> +
> +		/*
> +		 * If we're getting trouble reclaiming non-cma pages and
> +		 * currently a substantial number of CMA pages on LRU,

"sit on LRU" ?

> +		 * start reclaiming cma pages to alleviate other threads
> +		 * and decrease lru size.
> +		 */
> +		if (sc->priority < DEF_PRIORITY - 2 &&
> +		    sc->nr_scanned < (sc->nr_skipped_cma >> 3))

Why "sc->nr_skipped_cma >> 3"? It feels a bit hardcoded.
Maybe the comment or the changelog should contain a reference about why
this "/8" was a good choice.
On 2024/4/8 16:38, Oscar Salvador wrote:
> On Mon, Apr 08, 2024 at 04:05:39PM +0800, liuhailong@oppo.com wrote:
>> From: liuhailong <liuhailong@oppo.com>
>> @@ -6202,6 +6213,16 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
>>  		 */
>>  		if (sc->priority < DEF_PRIORITY - 2)
>>  			sc->may_writepage = 1;
>> +
>> +		/*
>> +		 * If we're getting trouble reclaiming non-cma pages and
>> +		 * currently a substantial number of CMA pages on LRU,
> "sit on LRU" ?

Got it, thanks.

>
>> +		 * start reclaiming cma pages to alleviate other threads
>> +		 * and decrease lru size.
>> +		 */
>> +		if (sc->priority < DEF_PRIORITY - 2 &&
>> +		    sc->nr_scanned < (sc->nr_skipped_cma >> 3))
>
> Why "sc->nr_skipped_cma >> 3"? It feels a bit hardcoded.
> Maybe the comment or the changelog should contain a reference about why
> this "/8" was a good choice.

When the number of skipped CMA pages exceeds eight times the number of
scanned pages, it indicates that CMA pages constitute the majority of the
LRU pages. Setting the value too low may result in premature reclamation
of CMA pages, which are unnecessary for non-movable allocations.
Conversely, setting it too high may delay problem detection until much
later, wasting CPU time in idle loops.

Brs,
Hailong
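Hailong's ratio can be checked directly against the trace in the changelog. The sketch below is ordinary userspace C, with DEF_PRIORITY assumed to match the kernel's value of 12 and the sample values copied from the trace above; it simply replays the new "nr_scanned < (nr_skipped_cma >> 3)" condition from do_try_to_free_pages() on those samples.

        /*
         * Replay of the nr_scanned < (nr_skipped_cma >> 3) heuristic on the
         * trace data from the changelog. Plain userspace C; DEF_PRIORITY is
         * assumed to be the kernel's value of 12.
         */
        #include <stdio.h>
        #include <stdbool.h>

        #define DEF_PRIORITY 12

        struct sample {
                int priority;
                unsigned long nr_skipped_cma;
                unsigned long nr_scanned;
        };

        static bool would_stop_skipping_cma(const struct sample *s)
        {
                return s->priority < DEF_PRIORITY - 2 &&
                       s->nr_scanned < (s->nr_skipped_cma >> 3);
        }

        int main(void)
        {
                const struct sample trace[] = {
                        { 9,  32208,  36 }, { 8,  32199,  69 }, { 7, 198405, 121 },
                        { 6, 236713, 147 }, { 5, 708209, 379 }, { 4, 785537, 646 },
                };

                for (size_t i = 0; i < sizeof(trace) / sizeof(trace[0]); i++)
                        printf("priority %d: skipped/8 = %lu, scanned = %lu -> no_skip_cma %s\n",
                               trace[i].priority, trace[i].nr_skipped_cma >> 3,
                               trace[i].nr_scanned,
                               would_stop_skipping_cma(&trace[i]) ? "set" : "not set");
                return 0;
        }

At priority 4, for example, 785537 >> 3 is 98192, far above the 646 pages actually scanned, so the check fires comfortably; even the first sample at priority 9 (32208 >> 3 = 4026 against 36 scanned) would already trip it.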
diff --git a/mm/vmscan.c b/mm/vmscan.c
index fa321c125099..2c74c1c94d88 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -114,6 +114,9 @@ struct scan_control {
 	/* Proactive reclaim invoked by userspace through memory.reclaim */
 	unsigned int proactive:1;
 
+	/* Can reclaim skip cma pages */
+	unsigned int no_skip_cma:1;
+
 	/*
 	 * Cgroup memory below memory.low is protected as long as we
 	 * don't threaten to OOM. If any cgroup is reclaimed at
@@ -157,6 +160,9 @@ struct scan_control {
 	/* Number of pages freed so far during a call to shrink_zones() */
 	unsigned long nr_reclaimed;
 
+	/* Number of cma-pages skipped so far during a call to shrink_zones() */
+	unsigned long nr_skipped_cma;
+
 	struct {
 		unsigned int dirty;
 		unsigned int unqueued_dirty;
@@ -1572,9 +1578,13 @@ static __always_inline void update_lru_sizes(struct lruvec *lruvec,
  */
 static bool skip_cma(struct folio *folio, struct scan_control *sc)
 {
-	return !current_is_kswapd() &&
+	bool ret = !current_is_kswapd() && !sc->no_skip_cma &&
 		gfp_migratetype(sc->gfp_mask) != MIGRATE_MOVABLE &&
 		folio_migratetype(folio) == MIGRATE_CMA;
+
+	if (ret)
+		sc->nr_skipped_cma += folio_nr_pages(folio);
+	return ret;
 }
 #else
 static bool skip_cma(struct folio *folio, struct scan_control *sc)
@@ -6188,6 +6198,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 			vmpressure_prio(sc->gfp_mask, sc->target_mem_cgroup,
 					sc->priority);
 		sc->nr_scanned = 0;
+		sc->nr_skipped_cma = 0;
 		shrink_zones(zonelist, sc);
 
 		if (sc->nr_reclaimed >= sc->nr_to_reclaim)
@@ -6202,6 +6213,16 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 		 */
 		if (sc->priority < DEF_PRIORITY - 2)
 			sc->may_writepage = 1;
+
+		/*
+		 * If we're getting trouble reclaiming non-cma pages and
+		 * currently a substantial number of CMA pages on LRU,
+		 * start reclaiming cma pages to alleviate other threads
+		 * and decrease lru size.
+		 */
+		if (sc->priority < DEF_PRIORITY - 2 &&
+		    sc->nr_scanned < (sc->nr_skipped_cma >> 3))
+			sc->no_skip_cma = 1;
 	} while (--sc->priority >= 0);
 
 	last_pgdat = NULL;