
[RFC,v2] mm/swap.c: Enable promotion of unmapped MGLRU page cache pages

Message ID 20250403141032.22743-1-donettom@linux.ibm.com (mailing list archive)
State New

Commit Message

Donet Tom April 3, 2025, 2:10 p.m. UTC
This patch is based on patch [1], which introduced support for
promoting unmapped normal LRU page cache pages. Here, we extend
the functionality to support promotion of MGLRU page cache pages.

An MGLRU page cache page is eligible for promotion when:

1. Memory tiering and pagecache_promotion_enabled are both enabled.
2. It resides in a lower memory tier.
3. It is referenced.
4. It is part of the working set.
5. Its folio reference count is at the maximum (LRU_REFS_MASK).
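
The five conditions above can be read as a single predicate. The sketch
below is a userspace model only: struct model_folio, struct model_config
and model_promotion_eligible() are hypothetical names that do not exist
in the kernel, and the 2-bit counter width is taken from the description
in this commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a 2-bit reference counter saturates at 3. */
#define MODEL_REFS_WIDTH 2
#define MODEL_REFS_MASK  ((1u << MODEL_REFS_WIDTH) - 1)

/* Hypothetical stand-in for the folio state the conditions refer to. */
struct model_folio {
	bool in_lower_tier;   /* condition 2 */
	bool referenced;      /* condition 3 */
	bool workingset;      /* condition 4 */
	unsigned int refs;    /* condition 5, range 0..MODEL_REFS_MASK */
};

/* Hypothetical stand-in for the two global switches in condition 1. */
struct model_config {
	bool memory_tiering;
	bool pagecache_promotion_enabled;
};

/* Returns true only when all five conditions hold. */
static bool model_promotion_eligible(const struct model_config *cfg,
				     const struct model_folio *f)
{
	assert(cfg != NULL && f != NULL);

	if (!cfg->memory_tiering || !cfg->pagecache_promotion_enabled)
		return false;
	if (!f->in_lower_tier)
		return false;
	if (!f->referenced || !f->workingset)
		return false;
	return f->refs == MODEL_REFS_MASK;
}
```

In the patch itself these checks are not gathered in one function; the
model only restates the list in executable form.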

When a page is accessed through a file descriptor, lru_gen_inc_refs()
is invoked. The first access sets the folio's referenced flag, and
subsequent accesses increment the reference count stored in the folio
flags (the counter is 2 bits wide). Once the referenced flag is set
and the reference count has reached its maximum value (LRU_REFS_MASK),
the next access sets the working set flag as well.
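
The access sequence described above can be sketched as a small state
machine. This is a single-threaded userspace model, not the kernel
code: struct sim_folio, enum sim_step and sim_access() are hypothetical
names, and the atomic flag updates of the real function are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the 2-bit counter: it saturates at 3. */
#define SIM_REFS_MASK 3u

struct sim_folio {
	bool referenced;
	bool workingset;
	unsigned int refs;
};

/* What a single access did to the folio in this model. */
enum sim_step {
	SIM_SET_REFERENCED,  /* first access */
	SIM_INC_REFS,        /* counter incremented */
	SIM_SET_WORKINGSET,  /* counter already at maximum, flag newly set */
	SIM_SATURATED,       /* all state set; with the patch, a candidate */
};

static enum sim_step sim_access(struct sim_folio *f)
{
	assert(f != NULL);

	if (!f->referenced) {
		f->referenced = true;
		f->refs = 0;
		return SIM_SET_REFERENCED;
	}
	if (f->refs == SIM_REFS_MASK) {
		if (!f->workingset) {
			f->workingset = true;
			return SIM_SET_WORKINGSET;
		}
		return SIM_SATURATED;
	}
	f->refs++;
	return SIM_INC_REFS;
}
```

Starting from a folio with no state set, the sixth call to sim_access()
is the first to return SIM_SATURATED in this model: one access for the
referenced flag, three to fill the counter, and one for the working set
flag.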

If a folio has both the referenced and working set flags set, and its
reference count equals LRU_REFS_MASK, it becomes a good candidate for
promotion. These pages will be added to the promotion list. The
per-process task task_numa_promotion_work() takes the pages from the
promotion list and promotes them to a higher memory tier.

In MGLRU, for folios accessed through a file descriptor, if the
folio's referenced and working set flags are set and its reference
count equals LRU_REFS_MASK, the folio is lazily promoted to the
second oldest generation in the eviction path. When folio_inc_gen()
does this, it clears LRU_REFS_FLAGS so that lru_gen_inc_refs() can
start over.
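
A rough model of that eviction-path step is sketched below. All names
(struct sketch_folio, sketch_evict_scan()) are hypothetical, generations
are reduced to a plain sequence number, and the sketch assumes that
LRU_REFS_FLAGS covers the referenced bit and the counter bits but not
the working set flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the 2-bit counter: it saturates at 3. */
#define SKETCH_REFS_MASK 3u

struct sketch_folio {
	bool referenced;
	bool workingset;
	unsigned int refs;
	unsigned int gen;   /* generation number; smaller means older */
};

/*
 * Models what the eviction path does when it meets a folio.
 * Returns true if the folio was moved to the second oldest generation
 * instead of being considered for eviction.
 */
static bool sketch_evict_scan(struct sketch_folio *f, unsigned int oldest_gen)
{
	assert(f != NULL);

	if (!(f->referenced && f->workingset && f->refs == SKETCH_REFS_MASK))
		return false;   /* not fully marked: nothing is changed here */

	f->gen = oldest_gen + 1;

	/*
	 * Assumption: clearing LRU_REFS_FLAGS resets the referenced bit and
	 * the counter, and leaves the working set flag alone.
	 */
	f->referenced = false;
	f->refs = 0;
	return true;
}
```

Because this step runs only when eviction scans the folio, a system
without memory pressure never reaches it, and the folio stays in
whatever generation it was in.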

Test process:
We measured the read time in the following scenarios for both LRU and
MGLRU.
Scenario 1: Pages are on the lower tier, promotion off
Scenario 2: Pages are on the lower tier, promotion on
Scenario 3: Pages are on the higher tier
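
The test program is not part of this posting. A read time of this kind
could be measured along the lines of the sketch below, which assumes a
plain sequential read() loop over one file; time_read_pass() is a
hypothetical helper, and placing the pages on a given tier beforehand
is outside its scope.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/*
 * Reads the whole file at 'path' once through a file descriptor and
 * returns the elapsed wall-clock time in seconds, or -1.0 on error.
 */
static double time_read_pass(const char *path)
{
	static char buf[1 << 16];
	struct timespec start, end;
	ssize_t n;
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1.0;

	if (clock_gettime(CLOCK_MONOTONIC, &start) != 0) {
		close(fd);
		return -1.0;
	}

	/* Each read() is an access through the file descriptor. */
	while ((n = read(fd, buf, sizeof(buf))) > 0)
		;

	if (n < 0 || clock_gettime(CLOCK_MONOTONIC, &end) != 0) {
		close(fd);
		return -1.0;
	}
	close(fd);

	return (double)(end.tv_sec - start.tv_sec) +
	       (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling the helper repeatedly on the same file would give one timing per
pass, which is how a "during promotion" pass could be told apart from a
later "after promotion" pass.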

Test Results MGLRU
-----------------------------------------------------------------------
Pages on higher tier | Pages on lower tier, | Pages on lower tier,
                     | promotion off        | promotion on
-----------------------------------------------------------------------
0.48s                | 1.6s                 | During promotion - 3.3s
                     |                      | After promotion  - 0.48s
-----------------------------------------------------------------------

Test Results LRU
-----------------------------------------------------------------------
Pages on higher tier | Pages on lower tier, | Pages on lower tier,
                     | promotion off        | promotion on
-----------------------------------------------------------------------
0.48s                | 1.6s                 | During promotion - 3.3s
                     |                      | After promotion  - 0.48s
-----------------------------------------------------------------------

MGLRU and LRU show a similar performance benefit.

[1] https://lore.kernel.org/all/20250107000346.1338481-1-gourry@gourry.net/

Signed-off-by: Donet Tom <donettom@linux.ibm.com>
---
v1->v2

In v1, folios that were part of the memcg and on the active MGLRU list
were promoted. However, in MGLRU, file pages accessed through file
descriptors are moved to the second oldest generation, which is not
necessarily part of the active list. Furthermore, this movement only
happens in the eviction path, so it does not occur when the system is
not under memory pressure. As a result, hot pages can be present in
any generation. In v2, a page becomes a candidate for promotion when
its reference count is at its maximum and its referenced and working
set flags are set.

v1 - https://lore.kernel.org/all/20250115120625.3785-1-donettom@linux.ibm.com/
---
 mm/swap.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Patch

diff --git a/mm/swap.c b/mm/swap.c
index b2341bc18452..f3c19d563556 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -399,8 +399,13 @@  static void lru_gen_inc_refs(struct folio *folio)
 
 	do {
 		if ((old_flags & LRU_REFS_MASK) == LRU_REFS_MASK) {
-			if (!folio_test_workingset(folio))
+			if (!folio_test_workingset(folio)) {
 				folio_set_workingset(folio);
+			} else if (!folio_test_isolated(folio) &&
+				  (sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) &&
+				   numa_pagecache_promotion_enabled) {
+				promotion_candidate(folio);
+			}
 			return;
 		}