Message ID | 20240710083641.546-1-justinjiang@vivo.com |
---|---|
State | New |
Series | [v10] mm: shrink skip folio mapped by an exiting process |
On Wed, 10 Jul 2024 16:36:41 +0800 Zhiguo Jiang <justinjiang@vivo.com> wrote:

> The release of a non-shared anonymous folio mapped solely by an
> exiting process may go through two flows: 1) the anonymous folio is
> first swapped out into swap space and turned into a swp_entry in
> shrink_folio_list; 2) the swp_entry is then released in the process
> exit flow. This results in a high CPU load for releasing a
> non-shared anonymous folio mapped solely by an exiting process.
>
> This is likely to happen when low system memory and an exiting
> process coincide, because the non-shared anonymous folio mapped
> solely by an exiting process may be reclaimed by shrink_folio_list.
>
> With this patch, shrink skips the non-shared anonymous folio mapped
> solely by an exiting process, and the folio is instead released
> directly in the process exit flow, which saves swap-out time and
> reduces the load of process exit.

Has any testing been performed to demonstrate any benefit?  If so, what
were the results?
On Thu, Jul 25, 2024 at 7:55 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Wed, 10 Jul 2024 16:36:41 +0800 Zhiguo Jiang <justinjiang@vivo.com> wrote:
>
> > The release of a non-shared anonymous folio mapped solely by an
> > exiting process may go through two flows: 1) the anonymous folio is
> > first swapped out into swap space and turned into a swp_entry in
> > shrink_folio_list; 2) the swp_entry is then released in the process
> > exit flow. This results in a high CPU load for releasing a
> > non-shared anonymous folio mapped solely by an exiting process.
> >
> > This is likely to happen when low system memory and an exiting
> > process coincide, because the non-shared anonymous folio mapped
> > solely by an exiting process may be reclaimed by shrink_folio_list.
> >
> > With this patch, shrink skips the non-shared anonymous folio mapped
> > solely by an exiting process, and the folio is instead released
> > directly in the process exit flow, which saves swap-out time and
> > reduces the load of process exit.
>
> Has any testing been performed to demonstrate any benefit?  If so, what
> were the results?

I think I shared my demonstration in version 7:
https://lore.kernel.org/linux-mm/20240710033212.36497-1-21cnbao@gmail.com/

I noticed a significant improvement with my small test program. I observed
that this patch effectively skipped 6114 folios (either 4KB or 64KB mTHP),
potentially reducing the swap-out by up to 92MB (97,300,480 bytes) during
the process exit. The working set size is 256MB.

If Zhiguo can add more test data from different (real) workloads, it would
be greatly appreciated.

Thanks
Barry
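[Editorial note: Barry's actual test program was not posted in this thread. A minimal userspace sketch of the kind of workload being described might look like the following (hypothetical reproducer, not the referenced test): it faults in a 256MB anonymous working set and then exits, so that reclaim and the process exit path compete for the same folios when the system is under memory pressure.]

/*
 * Hypothetical reproducer sketch -- not the test program referenced
 * above.  It maps a 256MB anonymous region, touches every byte so the
 * anonymous folios become resident, then exits.  Run it while the
 * system is under memory pressure (e.g. alongside a memory hog) so
 * that shrink_folio_list() scans the folios while the process exits.
 */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define WSS (256UL << 20)	/* 256MB working set */

int main(void)
{
	char *buf = mmap(NULL, WSS, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Fault in every page so the anonymous folios are mapped. */
	memset(buf, 0x5a, WSS);

	/* Give reclaim a window to start scanning before we exit. */
	sleep(1);

	/* Exiting frees the folios via the process exit path. */
	return 0;
}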
diff --git a/mm/rmap.c b/mm/rmap.c
index 86787df6e212..316a6bb9747b 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -75,6 +75,7 @@
 #include <linux/memremap.h>
 #include <linux/userfaultfd_k.h>
 #include <linux/mm_inline.h>
+#include <linux/oom.h>
 
 #include <asm/tlbflush.h>
 
@@ -870,6 +871,20 @@ static bool folio_referenced_one(struct folio *folio,
 			continue;
 		}
 
+		/*
+		 * Skip the non-shared swapbacked folio mapped solely by
+		 * the exiting or OOM-reaped process. This avoids redundant
+		 * swap-out followed by an immediate unmap.
+		 */
+		if ((!atomic_read(&vma->vm_mm->mm_users) ||
+		    check_stable_address_space(vma->vm_mm)) &&
+		    folio_test_anon(folio) && folio_test_swapbacked(folio) &&
+		    !folio_likely_mapped_shared(folio)) {
+			pra->referenced = -1;
+			page_vma_mapped_walk_done(&pvmw);
+			return false;
+		}
+
 		if (pvmw.pte) {
 			if (lru_gen_enabled() &&
 			    pte_young(ptep_get(pvmw.pte))) {
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 0761f91b407f..9afe4bb5ba87 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -863,7 +863,12 @@ static enum folio_references folio_check_references(struct folio *folio,
 	if (vm_flags & VM_LOCKED)
 		return FOLIOREF_ACTIVATE;
 
-	/* rmap lock contention: rotate */
+	/*
+	 * There are two cases to consider.
+	 * 1) Rmap lock contention: rotate.
+	 * 2) Skip the non-shared swapbacked folio mapped solely by
+	 *    the exiting or OOM-reaped process.
+	 */
 	if (referenced_ptes == -1)
 		return FOLIOREF_KEEP;
 
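[Editorial note: for readers unfamiliar with the reclaim path, the fragment below is a simplified paraphrase of how shrink_folio_list() consumes the result; it is not part of this patch and not verbatim kernel source. The -1 set in folio_referenced_one() is turned into FOLIOREF_KEEP by folio_check_references(), so reclaim leaves the folio on its list instead of swapping it out, and the exit path later frees it directly.]

	/*
	 * Simplified sketch of the consuming side in shrink_folio_list()
	 * (paraphrased): FOLIOREF_KEEP skips swap-out entirely, so a
	 * folio mapped only by an exiting or OOM-reaped process is left
	 * for the process exit path to free.
	 */
	references = folio_check_references(folio, sc);
	switch (references) {
	case FOLIOREF_ACTIVATE:
		goto activate_locked;
	case FOLIOREF_KEEP:
		/* Folio is kept; no pageout()/swap-out is attempted. */
		goto keep_locked;
	case FOLIOREF_RECLAIM:
	case FOLIOREF_RECLAIM_CLEAN:
		/* Proceed toward swap-out as before. */
		break;
	}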