mbox series

[bpf-next,v2,0/6] bpf, mm: Introduce __GFP_TRYLOCK

Message ID 20241210023936.46871-1-alexei.starovoitov@gmail.com (mailing list archive)
Headers show
Series bpf, mm: Introduce __GFP_TRYLOCK | expand

Message

Alexei Starovoitov Dec. 10, 2024, 2:39 a.m. UTC
From: Alexei Starovoitov <ast@kernel.org>

Hi All,

This is a more complete patch set that introduces __GFP_TRYLOCK
for opportunistic page allocation and lockless page freeing.
It's usable for bpf as-is.
The main motivation is to remove bpf_mem_alloc and make
alloc page and slab reentrant.
These patch set is a first step. Once try_alloc_pages() is available
new_slab() can be converted to it and the rest of kmalloc/slab_alloc.

I started hacking kmalloc() to replace bpf_mem_alloc() completely,
but ___slab_alloc() is quite complex to convert to trylock.
Mainly deactivate_slab part. It cannot fail, but when only trylock
is available I'm running out of ideas.
So far I'm thinking to limit it to:
- USE_LOCKLESS_FAST_PATH
  Which would mean that we would need to keep bpf_mem_alloc only for RT :(
- slab->flags & __CMPXCHG_DOUBLE, because various debugs cannot work in
  trylock mode. bit slab_lock() cannot be made to work with trylock either.
- simple kasan poison/unposion, since kasan_kmalloc and kasan_slab_free are
  too fancy with their own locks.

v1->v2:
- fixed buggy try_alloc_pages_noprof() in PREEMPT_RT. Thanks Peter.
- optimize all paths by doing spin_trylock_irqsave() first
  and only then check for gfp_flags & __GFP_TRYLOCK.
  Then spin_lock_irqsave() if it's a regular mode.
  So new gfp flag will not add performance overhead.
- patches 2-5 are new. They introduce lockless and/or trylock free_pages_nolock()
  and memcg support. So it's in usable shape for bpf in patch 6.

v1:
https://lore.kernel.org/bpf/20241116014854.55141-1-alexei.starovoitov@gmail.com/

Alexei Starovoitov (6):
  mm, bpf: Introduce __GFP_TRYLOCK for opportunistic page allocation
  mm, bpf: Introduce free_pages_nolock()
  locking/local_lock: Introduce local_trylock_irqsave()
  memcg: Add __GFP_TRYLOCK support.
  mm, bpf: Use __GFP_ACCOUNT in try_alloc_pages().
  bpf: Use try_alloc_pages() to allocate pages for bpf needs.

 include/linux/gfp.h                 | 25 ++++++++
 include/linux/gfp_types.h           |  3 +
 include/linux/local_lock.h          |  9 +++
 include/linux/local_lock_internal.h | 23 +++++++
 include/linux/mm_types.h            |  4 ++
 include/linux/mmzone.h              |  3 +
 include/trace/events/mmflags.h      |  1 +
 kernel/bpf/syscall.c                |  4 +-
 mm/fail_page_alloc.c                |  6 ++
 mm/internal.h                       |  1 +
 mm/memcontrol.c                     | 21 +++++--
 mm/page_alloc.c                     | 94 +++++++++++++++++++++++++----
 tools/perf/builtin-kmem.c           |  1 +
 13 files changed, 177 insertions(+), 18 deletions(-)