
[hotfix,6.12,0/2] introduce VMA merge mode to improve brk() performance

Message ID cover.1729174352.git.lorenzo.stoakes@oracle.com (mailing list archive)

Lorenzo Stoakes Oct. 17, 2024, 2:31 p.m. UTC
A ~5% performance regression in aim9.brk_test.ops_per_sec was reported by
the Linux kernel test bot [0].

In the past, to satisfy brk() performance, we duplicated VMA expansion code
and special-cased do_brk_flags(). This is, however, horrid and undoes work
to abstract this logic, so in resolving the issue I have endeavoured to
avoid this.

Investigating further, I observed that the use of a vma_iter_next_range()
and vma_prev() pair causes an unnecessary maple tree walk. In addition, we
do work that is simply unnecessary for brk().

Therefore, add a special VMA merge mode, VMG_FLAG_JUST_EXPAND, to avoid
doing any of this - it assumes the VMA iterator is pointing at the previous
VMA and skips logic that brk() does not require.
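
To illustrate the idea only, here is a toy userspace model of an
expand-only merge. It is not the kernel implementation: struct toy_vma,
toy_just_expand() and the plain flag comparison are hypothetical
simplifications, and the real code in mm/vma.c deals with locking, the
maple tree and many more mergeability checks. The sketch only shows why
trusting the caller to supply the previous mapping removes the need to
look up neighbours.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a VMA: a half-open range [start, end) plus flags. */
struct toy_vma {
	unsigned long start;
	unsigned long end;
	unsigned long flags;
};

/*
 * Hypothetical expand-only merge. The caller promises that 'prev' is the
 * mapping immediately before the new range, so no neighbour lookup is
 * performed here. If [start, end) begins exactly where 'prev' ends and
 * carries identical flags, 'prev' is grown and true is returned.
 * Otherwise 'prev' is left untouched and false is returned.
 */
static bool toy_just_expand(struct toy_vma *prev, unsigned long start,
			    unsigned long end, unsigned long flags)
{
	if (prev == NULL || start >= end)
		return false;
	if (prev->end != start || prev->flags != flags)
		return false;

	prev->end = end;
	return true;
}
```

With a toy heap mapping covering [0x1000, 0x2000), a call such as
toy_just_expand(&heap, 0x2000, 0x3000, heap.flags) grows it to end at
0x3000, while a non-adjacent range or one with different flags is rejected.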

This mostly eliminates the performance regression, reducing it to ~2%,
which is in the realm of noise. In addition, the will-it-scale test brk2,
written to be more representative of real-world brk() usage, shows a modest
performance improvement - which gives me confidence that we are not
meaningfully regressing real workloads here.

This series includes a test asserting that the 'just expand' mode works as
expected.

With many thanks to Oliver Sang for helping with performance testing of
candidate patch sets!

[0]:https://lore.kernel.org/linux-mm/202409301043.629bea78-oliver.sang@intel.com

Lorenzo Stoakes (2):
  mm/vma: add expand-only VMA merge mode and optimise do_brk_flags()
  tools: testing: add expand-only mode VMA test

 mm/mmap.c               |  3 ++-
 mm/vma.c                | 23 +++++++++++++++--------
 mm/vma.h                | 14 ++++++++++++++
 tools/testing/vma/vma.c | 40 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 71 insertions(+), 9 deletions(-)

--
2.46.2