[RFC,0/2] Reduce TLB flushes under some specific conditions

Message ID 20230804061850.21498-1-byungchul@sk.com (mailing list archive)

Byungchul Park Aug. 4, 2023, 6:18 a.m. UTC
Hi,

While working with CXL, I have been facing migration overhead, especially
TLB shootdowns on promotion or demotion between different tiers. Most
TLB shootdowns on migration through hinting faults can already be
avoided thanks to Huang Ying's work, commit 4d4b6d66db ("mm,unmap: avoid
flushing TLB in batch if PTE is inaccessible").

However, that only covers migration triggered by hinting faults. I
thought it would be much better to have a general mechanism for reducing
the number of TLB flushes that can be applied to any type of migration.
For now, I have tried it only for tiering migration.

I'm suggesting a mechanism that reduces TLB flushes by keeping both the
source and destination folios involved in a migration until all required
TLB flushes are done, and only if those folios are not mapped by any PTE
with write permission.
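
To make the idea concrete, here is a minimal userspace sketch of the
deferral rule described above. It is not code from the patches: every
name in it (folio_model, migration_batch, migrate_one, finish_batch,
MAX_PENDING) is hypothetical, and the TLB flush is modelled as a simple
counter. It assumes that a folio with no writable mapping can keep its
source copy alive so that one later flush covers the whole batch, while
a writable folio is flushed immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a folio; only the property the rule needs. */
struct folio_model {
	bool mapped_writable;	/* true if any PTE maps it with write permission */
};

#define MAX_PENDING 16	/* arbitrary capacity chosen for this sketch */

struct migration_batch {
	struct folio_model *pending_src[MAX_PENDING];	/* source folios kept alive */
	size_t nr_pending;
	unsigned long nr_flushes;	/* number of modelled TLB flushes */
};

/* Stand-in for a real TLB flush: just count it. */
static void tlb_flush_model(struct migration_batch *b)
{
	b->nr_flushes++;
}

/*
 * Migrate one folio. Returns true if its TLB flush was deferred.
 * Writable folios (or a full batch) fall back to an immediate flush.
 */
static bool migrate_one(struct migration_batch *b, struct folio_model *src)
{
	if (src->mapped_writable || b->nr_pending == MAX_PENDING) {
		tlb_flush_model(b);
		return false;
	}
	/* Keep the source folio until the batched flush has happened. */
	b->pending_src[b->nr_pending++] = src;
	return true;
}

/*
 * Issue one flush for all deferred folios, then drop the kept sources.
 * Returns the number of source folios released.
 */
static size_t finish_batch(struct migration_batch *b)
{
	size_t released = b->nr_pending;

	if (released) {
		tlb_flush_model(b);
		b->nr_pending = 0;
	}
	return released;
}
```

In this model, migrating two read-only folios and one writable folio
costs two flushes (one immediate, one batched) instead of three. The
price is that the source folios stay allocated until finish_batch()
runs.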

I saw the number of full TLB flushes reduced by over 50%, and performance
improved slightly, though not by much, with the workload I tested,
XSBench. However, I believe it would help more with other workloads,
including real ones. Please let me know if I'm missing something.

	Byungchul

Byungchul Park (2):
  mm/rmap: Recognize non-writable TLB entries during TLB batch flush
  mm: Defer TLB flush by keeping both src and dst folios at migration

 arch/x86/include/asm/tlbflush.h |   9 +
 arch/x86/mm/tlb.c               |  59 +++++++
 include/linux/mm.h              |  30 ++++
 include/linux/mm_types.h        |  34 ++++
 include/linux/mm_types_task.h   |   4 +-
 include/linux/mmzone.h          |   6 +
 include/linux/sched.h           |   5 +
 init/Kconfig                    |  12 ++
 mm/internal.h                   |  14 ++
 mm/memory.c                     |   9 +-
 mm/migrate.c                    | 287 +++++++++++++++++++++++++++++++-
 mm/mm_init.c                    |   1 +
 mm/page_alloc.c                 |  16 ++
 mm/rmap.c                       | 121 +++++++++++++-
 14 files changed, 595 insertions(+), 12 deletions(-)