| Message ID | 20241014221231.832959-1-weixugc@google.com (mailing list archive) |
|---|---|
| State | New |
| Series | [1/2] mm/mglru: only clear kswapd_failures if reclaimable |
On Mon, 14 Oct 2024 22:12:31 +0000 Wei Xu <weixugc@google.com> wrote:

> folio_activate() calls lru_gen_add_folio() to move the folio to the
> youngest generation. But unlike folio_update_gen()/folio_inc_gen(),
> lru_gen_add_folio() doesn't reset the folio lru tier bits
> (LRU_REFS_MASK | LRU_REFS_FLAGS). Fix this inconsistency in
> lru_gen_add_folio() when activating a folio.

What are the runtime effects of this flaw?
On Mon, Oct 14, 2024 at 4:27 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Mon, 14 Oct 2024 22:12:31 +0000 Wei Xu <weixugc@google.com> wrote:
>
> > folio_activate() calls lru_gen_add_folio() to move the folio to the
> > youngest generation. But unlike folio_update_gen()/folio_inc_gen(),
> > lru_gen_add_folio() doesn't reset the folio lru tier bits
> > (LRU_REFS_MASK | LRU_REFS_FLAGS). Fix this inconsistency in
> > lru_gen_add_folio() when activating a folio.
>
> What are the runtime effects of this flaw?

It can affect how pages get aged via the MGLRU PID controller, though no
bad behaviors clearly related to this have been observed at runtime. The
fix addresses an inconsistency identified through code inspection.
On Mon, Oct 14, 2024 at 4:12 PM Wei Xu <weixugc@google.com> wrote:
>
> folio_activate() calls lru_gen_add_folio() to move the folio to the
> youngest generation. But unlike folio_update_gen()/folio_inc_gen(),
> lru_gen_add_folio() doesn't reset the folio lru tier bits
> (LRU_REFS_MASK | LRU_REFS_FLAGS). Fix this inconsistency in
> lru_gen_add_folio() when activating a folio.
>
> Fixes: 018ee47f1489 ("mm: multi-gen LRU: exploit locality in rmap")
> Signed-off-by: Wei Xu <weixugc@google.com>
> ---
>  include/linux/mm_inline.h | 5 ++++-
>  include/linux/mmzone.h    | 2 ++
>  mm/vmscan.c               | 2 --
>  3 files changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
> index 6f801c7b36e2..87580e8363ef 100644
> --- a/include/linux/mm_inline.h
> +++ b/include/linux/mm_inline.h
> @@ -222,6 +222,7 @@ static inline bool lru_gen_add_folio(struct lruvec *lruvec, struct folio *folio,
>  {
>  	unsigned long seq;
>  	unsigned long flags;
> +	unsigned long mask;
>  	int gen = folio_lru_gen(folio);
>  	int type = folio_is_file_lru(folio);
>  	int zone = folio_zonenum(folio);
> @@ -257,7 +258,9 @@ static inline bool lru_gen_add_folio(struct lruvec *lruvec, struct folio *folio,
>  	gen = lru_gen_from_seq(seq);
>  	flags = (gen + 1UL) << LRU_GEN_PGOFF;
>  	/* see the comment on MIN_NR_GENS about PG_active */
> -	set_mask_bits(&folio->flags, LRU_GEN_MASK | BIT(PG_active), flags);
> +	mask = LRU_GEN_MASK | BIT(PG_active);
> +	mask |= folio_test_active(folio) ? (LRU_REFS_MASK | LRU_REFS_FLAGS) : 0;

We shouldn't clear PG_workingset here because it can affect PSI
accounting, if the activation is due to workingset refault.

Also, nit:

	mask = LRU_GEN_MASK;
	if (folio_test_active(folio))
		mask |= LRU_REFS_MASK | BIT(PG_active) | BIT(PG_referenced);

> +	set_mask_bits(&folio->flags, mask, flags);
>
>  	lru_gen_update_size(lruvec, folio, -1, gen);
>  	/* for folio_rotate_reclaimable() */
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 17506e4a2835..96dea31fb211 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -403,6 +403,8 @@ enum {
>  	NR_LRU_GEN_CAPS
>  };
>
> +#define LRU_REFS_FLAGS	(BIT(PG_referenced) | BIT(PG_workingset))
> +
>  #define MIN_LRU_BATCH	BITS_PER_LONG
>  #define MAX_LRU_BATCH	(MIN_LRU_BATCH * 64)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 9d1e1c4e383d..907262ebaef8 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2601,8 +2601,6 @@ static bool should_clear_pmd_young(void)
>   * shorthand helpers
>   ******************************************************************************/
>
> -#define LRU_REFS_FLAGS	(BIT(PG_referenced) | BIT(PG_workingset))
> -
>  #define DEFINE_MAX_SEQ(lruvec) \
>  	unsigned long max_seq = READ_ONCE((lruvec)->lrugen.max_seq)
>
> --
> 2.47.0.rc1.288.g06298d1525-goog
On Tue, 15 Oct 2024 22:55:23 -0600 Yu Zhao <yuzhao@google.com> wrote:

> > @@ -257,7 +258,9 @@ static inline bool lru_gen_add_folio(struct lruvec *lruvec, struct folio *folio,
> >  	gen = lru_gen_from_seq(seq);
> >  	flags = (gen + 1UL) << LRU_GEN_PGOFF;
> >  	/* see the comment on MIN_NR_GENS about PG_active */
> > -	set_mask_bits(&folio->flags, LRU_GEN_MASK | BIT(PG_active), flags);
> > +	mask = LRU_GEN_MASK | BIT(PG_active);
> > +	mask |= folio_test_active(folio) ? (LRU_REFS_MASK | LRU_REFS_FLAGS) : 0;
>
> We shouldn't clear PG_workingset here because it can affect PSI
> accounting, if the activation is due to workingset refault.
>
> Also, nit:
> 	mask = LRU_GEN_MASK;
> 	if (folio_test_active(folio))
> 		mask |= LRU_REFS_MASK | BIT(PG_active) | BIT(PG_referenced);

Thanks, I'll drop this version of this patch.

When resending, please include a full description of the userspace-visible
effects of the original flaw, thanks.
On Wed, Oct 16, 2024 at 3:55 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Tue, 15 Oct 2024 22:55:23 -0600 Yu Zhao <yuzhao@google.com> wrote:
>
> > > @@ -257,7 +258,9 @@ static inline bool lru_gen_add_folio(struct lruvec *lruvec, struct folio *folio,
> > >  	gen = lru_gen_from_seq(seq);
> > >  	flags = (gen + 1UL) << LRU_GEN_PGOFF;
> > >  	/* see the comment on MIN_NR_GENS about PG_active */
> > > -	set_mask_bits(&folio->flags, LRU_GEN_MASK | BIT(PG_active), flags);
> > > +	mask = LRU_GEN_MASK | BIT(PG_active);
> > > +	mask |= folio_test_active(folio) ? (LRU_REFS_MASK | LRU_REFS_FLAGS) : 0;
> >
> > We shouldn't clear PG_workingset here because it can affect PSI
> > accounting, if the activation is due to workingset refault.

Good point. I have addressed this in the v2 patch.

> > Also, nit:
> > 	mask = LRU_GEN_MASK;
> > 	if (folio_test_active(folio))
> > 		mask |= LRU_REFS_MASK | BIT(PG_active) | BIT(PG_referenced);
>
> Thanks, I'll drop this version of this patch.
>
> When resending, please include a full description of the userspace-visible
> effects of the original flaw, thanks.

I have sent out a v2 patch, which includes a description as suggested.
diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 6f801c7b36e2..87580e8363ef 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -222,6 +222,7 @@ static inline bool lru_gen_add_folio(struct lruvec *lruvec, struct folio *folio,
 {
 	unsigned long seq;
 	unsigned long flags;
+	unsigned long mask;
 	int gen = folio_lru_gen(folio);
 	int type = folio_is_file_lru(folio);
 	int zone = folio_zonenum(folio);
@@ -257,7 +258,9 @@ static inline bool lru_gen_add_folio(struct lruvec *lruvec, struct folio *folio,
 	gen = lru_gen_from_seq(seq);
 	flags = (gen + 1UL) << LRU_GEN_PGOFF;
 	/* see the comment on MIN_NR_GENS about PG_active */
-	set_mask_bits(&folio->flags, LRU_GEN_MASK | BIT(PG_active), flags);
+	mask = LRU_GEN_MASK | BIT(PG_active);
+	mask |= folio_test_active(folio) ? (LRU_REFS_MASK | LRU_REFS_FLAGS) : 0;
+	set_mask_bits(&folio->flags, mask, flags);
 
 	lru_gen_update_size(lruvec, folio, -1, gen);
 	/* for folio_rotate_reclaimable() */
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 17506e4a2835..96dea31fb211 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -403,6 +403,8 @@ enum {
 	NR_LRU_GEN_CAPS
 };
 
+#define LRU_REFS_FLAGS	(BIT(PG_referenced) | BIT(PG_workingset))
+
 #define MIN_LRU_BATCH	BITS_PER_LONG
 #define MAX_LRU_BATCH	(MIN_LRU_BATCH * 64)
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 9d1e1c4e383d..907262ebaef8 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2601,8 +2601,6 @@ static bool should_clear_pmd_young(void)
  * shorthand helpers
  ******************************************************************************/
 
-#define LRU_REFS_FLAGS	(BIT(PG_referenced) | BIT(PG_workingset))
-
 #define DEFINE_MAX_SEQ(lruvec) \
 	unsigned long max_seq = READ_ONCE((lruvec)->lrugen.max_seq)
folio_activate() calls lru_gen_add_folio() to move the folio to the
youngest generation. But unlike folio_update_gen()/folio_inc_gen(),
lru_gen_add_folio() doesn't reset the folio lru tier bits
(LRU_REFS_MASK | LRU_REFS_FLAGS). Fix this inconsistency in
lru_gen_add_folio() when activating a folio.

Fixes: 018ee47f1489 ("mm: multi-gen LRU: exploit locality in rmap")
Signed-off-by: Wei Xu <weixugc@google.com>
---
 include/linux/mm_inline.h | 5 ++++-
 include/linux/mmzone.h    | 2 ++
 mm/vmscan.c               | 2 --
 3 files changed, 6 insertions(+), 3 deletions(-)