Message ID | 20201014005320.2233162-3-kaleshsingh@google.com (mailing list archive) |
---|---|
State | New |
Series | Speed up mremap on large regions |
On Wed, Oct 14, 2020 at 12:53:07AM +0000, Kalesh Singh wrote:
> HAVE_MOVE_PMD enables remapping pages at the PMD level if both the
> source and destination addresses are PMD-aligned.
>
> HAVE_MOVE_PMD is already enabled on x86. The original patch [1] that
> introduced this config did not enable it on arm64 at the time because
> of performance issues with flushing the TLB on every PMD move. These
> issues have since been addressed in more recent releases with
> improvements to the arm64 TLB invalidation and core mmu_gather code as
> Will Deacon mentioned in [2].
>
> From the data below, it can be inferred that there is approximately
> 8x improvement in performance when HAVE_MOVE_PMD is enabled on arm64.
>
> --------- Test Results ----------
>
> The following results were obtained on an arm64 device running a 5.4
> kernel, by remapping a PMD-aligned, 1GB sized region to a PMD-aligned
> destination. The results from 10 iterations of the test are given below.
> All times are in nanoseconds.
>
> Control      HAVE_MOVE_PMD
>
> 9220833      1247761
> 9002552      1219896
> 9254115      1094792
> 8725885      1227760
> 9308646      1043698
> 9001667      1101771
> 8793385      1159896
> 8774636      1143594
> 9553125      1025833
> 9374010      1078125
>
> 9100885.4    1134312.6    <-- Mean Time in nanoseconds
>
> Total mremap time for a 1GB sized PMD-aligned region drops from
> ~9.1 milliseconds to ~1.1 milliseconds. (~8x speedup).
>
> [1] https://lore.kernel.org/r/20181108181201.88826-3-joelaf@google.com
> [2] https://www.mail-archive.com/linuxppc-dev@lists.ozlabs.org/msg140837.html
>
> Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> ---
> Changes in v4:
>   - Add Kirill's Acked-by.

Argh, I thought we already enabled this for PMDs back in 2018!
Looks like we forgot to actually do that after I improved the performance
of the TLB invalidation. I'll pick this one patch up for 5.10.

Will
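
The mean times and the ~8x figure quoted in the commit message can be reproduced from the ten listed samples. A minimal sketch in Python, assuming only the numbers from the results table; the variable names are illustrative and not part of the patch:

```python
from statistics import mean

# Per-iteration mremap times in nanoseconds, copied from the quoted results.
control_ns = [9220833, 9002552, 9254115, 8725885, 9308646,
              9001667, 8793385, 8774636, 9553125, 9374010]
move_pmd_ns = [1247761, 1219896, 1094792, 1227760, 1043698,
               1101771, 1159896, 1143594, 1025833, 1078125]

control_mean = mean(control_ns)      # 9100885.4
move_pmd_mean = mean(move_pmd_ns)    # 1134312.6

# Ratio of the two means; this is the "~8x" quoted in the commit message.
speedup = control_mean / move_pmd_mean

print(f"control mean:       {control_mean / 1e6:.2f} ms")
print(f"HAVE_MOVE_PMD mean: {move_pmd_mean / 1e6:.2f} ms")
print(f"speedup:            {speedup:.2f}x")
```

The ratio comes out slightly above 8, which matches the rounded figures of ~9.1 ms and ~1.1 ms given in the message.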
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 4b136e923ccb..434d6791e869 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -123,6 +123,7 @@ config ARM64
 	select GENERIC_VDSO_TIME_NS
 	select HANDLE_DOMAIN_IRQ
 	select HARDIRQS_SW_RESEND
+	select HAVE_MOVE_PMD
 	select HAVE_PCI
 	select HAVE_ACPI_APEI if (ACPI && EFI)
 	select HAVE_ALIGNED_STRUCT_PAGE if SLUB
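
A symbol selected in Kconfig shows up in the generated kernel configuration with a CONFIG_ prefix, so on an arm64 build with this hunk applied one would expect to see CONFIG_HAVE_MOVE_PMD=y. A small sketch in Python that checks config text for it; the function name and the sample fragments are hypothetical, and where the config text comes from (for example a build-tree .config) is left to the reader:

```python
def has_move_pmd(config_text: str) -> bool:
    """Return True if the given kernel config text enables CONFIG_HAVE_MOVE_PMD."""
    for line in config_text.splitlines():
        # Disabled options appear as "# CONFIG_FOO is not set" and do not match.
        if line.strip() == "CONFIG_HAVE_MOVE_PMD=y":
            return True
    return False


# Hypothetical config fragments, for illustration only.
enabled = "CONFIG_HAVE_PCI=y\nCONFIG_HAVE_MOVE_PMD=y\n"
disabled = "CONFIG_HAVE_PCI=y\n# CONFIG_HAVE_MOVE_PMD is not set\n"

print(has_move_pmd(enabled))
print(has_move_pmd(disabled))
```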