[RFC,v3,0/5] mm: support large folios swap-in

Message ID 20240304081348.197341-1-21cnbao@gmail.com (mailing list archive)

Barry Song March 4, 2024, 8:13 a.m. UTC
From: Barry Song <v-songbaohua@oppo.com>

-v3:
 * avoid overwriting err in __swap_duplicate_nr, pointed out by Yosry,
   thanks!
 * fix the issue that a folio is charged twice in do_swap_page by
   separating alloc_anon_folio and alloc_swap_folio, as they now differ
   in:
   * memcg charging
   * whether the allocated folio is cleared

-v2:
 https://lore.kernel.org/linux-mm/20240229003753.134193-1-21cnbao@gmail.com/
 * lots of code cleanup according to Chris's comments, thanks!
 * collect Chris's ack tags, thanks!
 * address David's comment on moving to use folio_add_new_anon_rmap
   for !folio_test_anon in do_swap_page, thanks!
 * remove the MADV_PAGEOUT patch from this series as Ryan will
   integrate it into the swap-out series
 * Apply Kairui's work of "mm/swap: fix race when skipping swapcache"
   on large folios swap-in as well
 * fix corrupted data (zero-filled data) in two races: zswap, and the
   case where some entries are in the swapcache while others are not,
   by checking SWAP_HAS_CACHE while swapping in a large folio

-v1:
 https://lore.kernel.org/all/20240118111036.72641-1-21cnbao@gmail.com/#t

On an embedded system like Android, more than half of anon memory is actually
in swap devices such as zRAM. For example, when an app is switched to the
background, most of its memory might be swapped out.

Now we have mTHP features. Unfortunately, if we don't support large folio
swap-in, then once those large folios are swapped out, we immediately lose the
performance gain we get from large folios and hardware optimizations such as
CONT-PTE.

In theory, we don't need to rely on Ryan's swap-out patchset[1]. That is to
say, even if some memory consisted of normal pages before swap-out, we could
still swap it in as large folios. But this might require I/O to happen at
random places in the swap device. So we limit large folio swap-in to those
areas which were large folios before swap-out, i.e. whose swap entries are
also contiguous in the swap device. On the other hand, in OPPO's products,
we've deployed anon large folios on millions of phones[2]. We enhanced
zsmalloc and zRAM to compress and decompress large folios as a whole, which
helps improve the compression ratio and decrease CPU consumption
significantly. In zsmalloc and zRAM we can save large objects whose original
size is, for example, 64KiB (related patches are coming). So it is also a good
choice for us to support swapping in large folios for those compressed large
objects, as a large folio can be decompressed all at once.
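
To illustrate the contiguity restriction described above, here is a minimal
user-space sketch, not the code from this series. It models each swap entry
as a plain offset and accepts a range only when the offsets are consecutive.
The type swp_offset_model_t and the function swap_offsets_form_large_folio()
are hypothetical names, and the natural-alignment check on the first offset is
an assumption of the sketch rather than something stated in this cover letter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a swap entry reduced to its offset in the device. */
typedef unsigned long swp_offset_model_t;

static bool is_power_of_two(size_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Return true if the nr offsets could back a single large folio:
 * consecutive in the swap device and (assumption of this sketch)
 * starting on an nr-aligned offset.
 */
static bool swap_offsets_form_large_folio(const swp_offset_model_t *offsets,
					  size_t nr)
{
	size_t i;

	/* Large folios cover a power-of-two number of pages. */
	assert(is_power_of_two(nr));

	if (offsets[0] % nr != 0)
		return false;

	for (i = 1; i < nr; i++) {
		if (offsets[i] != offsets[0] + i)
			return false;
	}

	return true;
}
```

A range that fails this check would fall back to being swapped in page by
page, which avoids issuing I/O to scattered locations for one folio.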

Note I am moving my previous "arm64: mm: swap: support THP_SWAP on hardware
with MTE" to this series as it might help review.

[1] [PATCH v3 0/4] Swap-out small-sized THP without splitting
https://lore.kernel.org/linux-mm/20231025144546.577640-1-ryan.roberts@arm.com/
[2] OnePlusOSS / android_kernel_oneplus_sm8550 
https://github.com/OnePlusOSS/android_kernel_oneplus_sm8550/tree/oneplus/sm8550_u_14.0.0_oneplus11

Barry Song (2):
  arm64: mm: swap: support THP_SWAP on hardware with MTE
  mm: swap: introduce swapcache_prepare_nr and swapcache_clear_nr for
    large folios swap-in

Chuanhua Han (3):
  mm: swap: introduce swap_nr_free() for batched swap_free()
  mm: swap: make should_try_to_free_swap() support large-folio
  mm: support large folios swapin as a whole

 arch/arm64/include/asm/pgtable.h |  19 +--
 arch/arm64/mm/mteswap.c          |  43 ++++++
 include/linux/huge_mm.h          |  12 --
 include/linux/pgtable.h          |   2 +-
 include/linux/swap.h             |   7 +
 mm/memory.c                      | 252 ++++++++++++++++++++++++++-----
 mm/page_io.c                     |   2 +-
 mm/swap.h                        |   1 +
 mm/swap_slots.c                  |   2 +-
 mm/swapfile.c                    | 153 +++++++++++++------
 10 files changed, 376 insertions(+), 117 deletions(-)