Message ID | 20231120175925.733167-2-davemarchevsky@fb.com (mailing list archive)
State | Changes Requested
Delegated to | BPF
Series | bpf: Add mmapable task_local storage
On Mon, Nov 20, 2023 at 09:59:24AM -0800, Dave Marchevsky wrote:
> This patch modifies the generic bpf_local_storage infrastructure to
> support mmapable map values and adds mmap() handling to task_local
> storage leveraging this new functionality. A userspace task which
> mmap's a task_local storage map will receive a pointer to the map_value
> corresponding to that tasks' key - mmap'ing in other tasks' mapvals is
> not supported in this patch.
>
> Currently, struct bpf_local_storage_elem contains both bookkeeping
> information as well as a struct bpf_local_storage_data with additional
> bookkeeping information and the actual mapval data. We can't simply map
> the page containing this struct into userspace. Instead, mmapable
> local_storage uses bpf_local_storage_data's data field to point to the
> actual mapval, which is allocated separately such that it can be
> mmapped. Only the mapval lives on the page(s) allocated for it.
>
> The lifetime of the actual_data mmapable region is tied to the
> bpf_local_storage_elem which points to it. This doesn't necessarily mean
> that the pages go away when the bpf_local_storage_elem is free'd - if
> they're mapped into some userspace process they will remain until
> unmapped, but are no longer the task_local storage's mapval.

Those bits look good to me. vfree() uses __free_pages(), which
participates in refcounting. remap_vmalloc_range() acquires references
to the individual pages, which will be dropped once the page tables
disappear on munmap(). The vmalloc area doesn't need to stick around.
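For readers less familiar with the vmalloc/remap machinery being referenced, here is a minimal, self-contained sketch of the lifetime pattern described above. It is not taken from the patch; the example_* names and the driver-style mmap handler are purely illustrative.

```c
#include <linux/vmalloc.h>
#include <linux/mm.h>
#include <linux/fs.h>

static void *example_buf;	/* illustrative stand-in for the mmapable mapval */

static int example_setup(size_t size)
{
	/* vmalloc_user() gives zeroed, VM_USERMAP-tagged pages suitable
	 * for remap_vmalloc_range()
	 */
	example_buf = vmalloc_user(PAGE_ALIGN(size));
	return example_buf ? 0 : -ENOMEM;
}

static int example_mmap(struct file *filp, struct vm_area_struct *vma)
{
	/* remap_vmalloc_range() takes a reference on each backing page */
	return remap_vmalloc_range(vma, example_buf, vma->vm_pgoff);
}

static void example_teardown(void)
{
	/* vfree() drops the kernel's references via __free_pages(); pages
	 * still mapped into userspace survive until munmap()
	 */
	vfree(example_buf);
	example_buf = NULL;
}
```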
On 11/20/23 9:59 AM, Dave Marchevsky wrote: > diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h > index 173ec7f43ed1..114973f925ea 100644 > --- a/include/linux/bpf_local_storage.h > +++ b/include/linux/bpf_local_storage.h > @@ -69,7 +69,17 @@ struct bpf_local_storage_data { > * the number of cachelines accessed during the cache hit case. > */ > struct bpf_local_storage_map __rcu *smap; > - u8 data[] __aligned(8); > + /* Need to duplicate smap's map_flags as smap may be gone when > + * it's time to free bpf_local_storage_data > + */ > + u64 smap_map_flags; > + /* If BPF_F_MMAPABLE, this is a void * to separately-alloc'd data > + * Otherwise the actual mapval data lives here > + */ > + union { > + DECLARE_FLEX_ARRAY(u8, data) __aligned(8); > + void *actual_data __aligned(8); The pages (that can be mmap'ed later) feel like a specific kind of kptr. Have you thought about allowing a kptr (pointing to some pages that can be mmap'ed later) to be stored as one of the members of the map's value as a kptr. bpf_local_storage_map is one of the maps that supports kptr. struct normal_and_mmap_value { int some_int; int __percpu_kptr *some_cnts; struct bpf_mmap_page __kptr *some_stats; }; struct mmap_only_value { struct bpf_mmap_page __kptr *some_stats; }; [ ... ] > diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c > index 146824cc9689..9b3becbcc1a3 100644 > --- a/kernel/bpf/bpf_local_storage.c > +++ b/kernel/bpf/bpf_local_storage.c > @@ -15,7 +15,8 @@ > #include <linux/rcupdate_trace.h> > #include <linux/rcupdate_wait.h> > > -#define BPF_LOCAL_STORAGE_CREATE_FLAG_MASK (BPF_F_NO_PREALLOC | BPF_F_CLONE) > +#define BPF_LOCAL_STORAGE_CREATE_FLAG_MASK \ > + (BPF_F_NO_PREALLOC | BPF_F_CLONE | BPF_F_MMAPABLE) > > static struct bpf_local_storage_map_bucket * > select_bucket(struct bpf_local_storage_map *smap, > @@ -24,6 +25,51 @@ select_bucket(struct bpf_local_storage_map *smap, > return &smap->buckets[hash_ptr(selem, smap->bucket_log)]; > } > > +struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map); > + > +void *alloc_mmapable_selem_value(struct bpf_local_storage_map *smap) > +{ > + struct mem_cgroup *memcg, *old_memcg; > + void *ptr; > + > + memcg = bpf_map_get_memcg(&smap->map); > + old_memcg = set_active_memcg(memcg); > + ptr = bpf_map_area_mmapable_alloc(PAGE_ALIGN(smap->map.value_size), > + NUMA_NO_NODE); > + set_active_memcg(old_memcg); > + mem_cgroup_put(memcg); > + > + return ptr; > +} [ ... ] > @@ -76,10 +122,19 @@ bpf_selem_alloc(struct bpf_local_storage_map *smap, void *owner, > void *value, bool charge_mem, gfp_t gfp_flags) > { > struct bpf_local_storage_elem *selem; > + void *mmapable_value = NULL; > + u32 selem_mem; > > - if (charge_mem && mem_charge(smap, owner, smap->elem_size)) > + selem_mem = selem_bytes_used(smap); > + if (charge_mem && mem_charge(smap, owner, selem_mem)) > return NULL; > > + if (smap->map.map_flags & BPF_F_MMAPABLE) { > + mmapable_value = alloc_mmapable_selem_value(smap); This probably is not always safe for bpf prog to do. Leaving the gfp_flags was not used aside, the bpf local storage is moving to the bpf's memalloc because of https://lore.kernel.org/bpf/20221118190109.1512674-1-namhyung@kernel.org/ > + if (!mmapable_value) > + goto err_out; > + } > +
Hi Dave,

kernel test robot noticed the following build warnings:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Dave-Marchevsky/bpf-Support-BPF_F_MMAPABLE-task_local-storage/20231121-020345
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20231120175925.733167-2-davemarchevsky%40fb.com
patch subject: [PATCH v1 bpf-next 1/2] bpf: Support BPF_F_MMAPABLE task_local storage
config: sparc-randconfig-r081-20231121 (https://download.01.org/0day-ci/archive/20231121/202311211037.6BmDdrr4-lkp@intel.com/config)
compiler: sparc-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231121/202311211037.6BmDdrr4-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202311211037.6BmDdrr4-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> kernel/bpf/syscall.c:407:20: warning: no previous prototype for 'bpf_map_get_memcg' [-Wmissing-prototypes]
     407 | struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map)
         |                    ^~~~~~~~~~~~~~~~~
--
>> kernel/bpf/bpf_local_storage.c:30:7: warning: no previous prototype for 'alloc_mmapable_selem_value' [-Wmissing-prototypes]
      30 | void *alloc_mmapable_selem_value(struct bpf_local_storage_map *smap)
         |       ^~~~~~~~~~~~~~~~~~~~~~~~~~
   kernel/bpf/bpf_local_storage.c: In function 'bpf_selem_free_rcu':
   kernel/bpf/bpf_local_storage.c:288:39: warning: variable 'smap' set but not used [-Wunused-but-set-variable]
     288 |         struct bpf_local_storage_map *smap;
         |                                       ^~~~

vim +/bpf_map_get_memcg +407 kernel/bpf/syscall.c

   406	
 > 407	struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map)
   408	{
   409		if (map->objcg)
   410			return get_mem_cgroup_from_objcg(map->objcg);
   411	
   412		return root_mem_cgroup;
   413	}
   414	
Hi Dave,
kernel test robot noticed the following build errors:
[auto build test ERROR on bpf-next/master]
url: https://github.com/intel-lab-lkp/linux/commits/Dave-Marchevsky/bpf-Support-BPF_F_MMAPABLE-task_local-storage/20231121-020345
base: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link: https://lore.kernel.org/r/20231120175925.733167-2-davemarchevsky%40fb.com
patch subject: [PATCH v1 bpf-next 1/2] bpf: Support BPF_F_MMAPABLE task_local storage
config: i386-buildonly-randconfig-005-20231121 (https://download.01.org/0day-ci/archive/20231121/202311211247.KBiJyddD-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231121/202311211247.KBiJyddD-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202311211247.KBiJyddD-lkp@intel.com/
All errors (new ones prefixed by >>):
ld: kernel/bpf/bpf_local_storage.o: in function `alloc_mmapable_selem_value':
bpf_local_storage.c:(.text+0x207): undefined reference to `bpf_map_get_memcg'
ld: kernel/bpf/bpf_local_storage.o: in function `bpf_selem_alloc':
bpf_local_storage.c:(.text+0x410): undefined reference to `bpf_map_get_memcg'
>> ld: bpf_local_storage.c:(.text+0x543): undefined reference to `bpf_map_get_memcg'
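A plausible way to address both robot reports (the missing prototype and the link failure on configs where the memcg helpers are compiled out) is sketched below. The header placement and the CONFIG_MEMCG_KMEM guard are assumptions about the cause, not necessarily what a v2 of the series would do.

```c
/* include/linux/bpf.h (assumed location for a shared declaration) */
#ifdef CONFIG_MEMCG_KMEM
struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map);
#else
static inline struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map)
{
	return NULL;	/* mem_cgroup_put()/set_active_memcg() accept NULL */
}
#endif

/* kernel/bpf/bpf_local_storage.c: drop the open-coded extern and make the
 * new allocator static, which also silences -Wmissing-prototypes.
 */
static void *alloc_mmapable_selem_value(struct bpf_local_storage_map *smap);
```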
Hi Dave,

kernel test robot noticed the following build warnings:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Dave-Marchevsky/bpf-Support-BPF_F_MMAPABLE-task_local-storage/20231121-020345
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20231120175925.733167-2-davemarchevsky%40fb.com
patch subject: [PATCH v1 bpf-next 1/2] bpf: Support BPF_F_MMAPABLE task_local storage
config: x86_64-randconfig-121-20231121 (https://download.01.org/0day-ci/archive/20231121/202311211358.O24bsL46-lkp@intel.com/config)
compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git ae42196bc493ffe877a7e3dff8be32035dea4d07)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231121/202311211358.O24bsL46-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202311211358.O24bsL46-lkp@intel.com/

sparse warnings: (new ones prefixed by >>)

>> kernel/bpf/bpf_local_storage.c:30:6: sparse: sparse: symbol 'alloc_mmapable_selem_value' was not declared. Should it be static?
>> kernel/bpf/bpf_local_storage.c:291:14: sparse: sparse: incorrect type in assignment (different address spaces) @@     expected struct bpf_local_storage_map *smap @@     got struct bpf_local_storage_map [noderef] __rcu *smap @@
   kernel/bpf/bpf_local_storage.c:291:14: sparse:     expected struct bpf_local_storage_map *smap
   kernel/bpf/bpf_local_storage.c:291:14: sparse:     got struct bpf_local_storage_map [noderef] __rcu *smap
   kernel/bpf/bpf_local_storage.c: note: in included file (through include/linux/mmzone.h, include/linux/gfp.h, include/linux/umh.h, include/linux/kmod.h, ...):
   include/linux/page-flags.h:242:46: sparse: sparse: self-comparison always evaluates to false

vim +/alloc_mmapable_selem_value +30 kernel/bpf/bpf_local_storage.c

    29	
  > 30	void *alloc_mmapable_selem_value(struct bpf_local_storage_map *smap)
    31	{
    32		struct mem_cgroup *memcg, *old_memcg;
    33		void *ptr;
    34	
    35		memcg = bpf_map_get_memcg(&smap->map);
    36		old_memcg = set_active_memcg(memcg);
    37		ptr = bpf_map_area_mmapable_alloc(PAGE_ALIGN(smap->map.value_size),
    38						  NUMA_NO_NODE);
    39		set_active_memcg(old_memcg);
    40		mem_cgroup_put(memcg);
    41	
    42		return ptr;
    43	}
    44	
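For the sparse address-space complaint, the usual pattern elsewhere in bpf_local_storage.c is to load the __rcu-annotated pointer through an RCU accessor rather than a plain assignment. A sketch of that shape follows; the surrounding function body is assumed, not copied from the patch.

```c
static void bpf_selem_free_rcu(struct rcu_head *rcu)
{
	struct bpf_local_storage_elem *selem;
	struct bpf_local_storage_map *smap;

	selem = container_of(rcu, struct bpf_local_storage_elem, rcu);
	/* RCU-annotated load instead of a plain assignment; alternatively,
	 * drop 'smap' entirely since gcc also flags it as set-but-unused.
	 */
	smap = rcu_dereference_check(SDATA(selem)->smap, bpf_rcu_lock_held());
}
```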
On Mon, Nov 20, 2023 at 09:59:24AM -0800, Dave Marchevsky wrote:
> +void *alloc_mmapable_selem_value(struct bpf_local_storage_map *smap)

should be static?

> +static int task_storage_map_mmap(struct bpf_map *map, struct vm_area_struct *vma)
> +{
> +	void *data;
> +
> +	if (!(map->map_flags & BPF_F_MMAPABLE) || vma->vm_pgoff ||
> +	    (vma->vm_end - vma->vm_start) < map->value_size)
> +		return -EINVAL;
> +
> +	WARN_ON_ONCE(!bpf_rcu_lock_held());

why? This is mmap() syscall. What is the concern?

> +	bpf_task_storage_lock();
> +	data = __bpf_task_storage_get(map, current, NULL, BPF_LOCAL_STORAGE_GET_F_CREATE,
> +				      0, true);

0 for gfp_flags? It probably should be GFP_USER?
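Folding these comments together (static allocator, no WARN_ON_ONCE in the mmap path, GFP_USER instead of 0), the handler might look roughly like the sketch below. This is an illustration of the suggested direction, not the author's follow-up.

```c
static int task_storage_map_mmap(struct bpf_map *map, struct vm_area_struct *vma)
{
	void *data;

	if (!(map->map_flags & BPF_F_MMAPABLE) || vma->vm_pgoff ||
	    (vma->vm_end - vma->vm_start) < map->value_size)
		return -EINVAL;

	bpf_task_storage_lock();
	/* GFP_USER per the review comment; mmap() is a sleepable syscall */
	data = __bpf_task_storage_get(map, current, NULL,
				      BPF_LOCAL_STORAGE_GET_F_CREATE,
				      GFP_USER, true);
	bpf_task_storage_unlock();
	if (!data)
		return -EINVAL;

	return remap_vmalloc_range(vma, data, vma->vm_pgoff);
}
```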
On 11/20/23 7:42 PM, Martin KaFai Lau wrote: > On 11/20/23 9:59 AM, Dave Marchevsky wrote: >> diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h >> index 173ec7f43ed1..114973f925ea 100644 >> --- a/include/linux/bpf_local_storage.h >> +++ b/include/linux/bpf_local_storage.h >> @@ -69,7 +69,17 @@ struct bpf_local_storage_data { >> * the number of cachelines accessed during the cache hit case. >> */ >> struct bpf_local_storage_map __rcu *smap; >> - u8 data[] __aligned(8); >> + /* Need to duplicate smap's map_flags as smap may be gone when >> + * it's time to free bpf_local_storage_data >> + */ >> + u64 smap_map_flags; >> + /* If BPF_F_MMAPABLE, this is a void * to separately-alloc'd data >> + * Otherwise the actual mapval data lives here >> + */ >> + union { >> + DECLARE_FLEX_ARRAY(u8, data) __aligned(8); >> + void *actual_data __aligned(8); > > The pages (that can be mmap'ed later) feel like a specific kind of kptr. > > Have you thought about allowing a kptr (pointing to some pages that can be mmap'ed later) to be stored as one of the members of the map's value as a kptr. bpf_local_storage_map is one of the maps that supports kptr. > > struct normal_and_mmap_value { > int some_int; > int __percpu_kptr *some_cnts; > > struct bpf_mmap_page __kptr *some_stats; > }; > > struct mmap_only_value { > struct bpf_mmap_page __kptr *some_stats; > }; > > [ ... ] > This is an intriguing idea. For conciseness I'll call this specific kind of kptr 'mmapable kptrs' for the rest of this message. Below is more of a brainstorming dump than a cohesive response, separate trains of thought are separated by two newlines. My initial thought upon seeing struct normal_and_mmap_value was to note that we currently don't support mmaping for map_value types with _any_ special fields ('special' as determined by btf_parse_fields). But IIUC you're actually talking about exposing the some_stats pointee memory via mmap, not the containing struct with kptr fields. That is, for maps that support these kptrs, mmap()ing a map with value type struct normal_and_mmap_value would return the some_stats pointer value, and likely initialize the pointer similarly to BPF_LOCAL_STORAGE_GET_F_CREATE logic in this patch. We'd only be able to support one such mmapable kptr field per mapval type, but that isn't a dealbreaker. Some maps, like task_storage, would only support mmap() on a map_value with mmapable kptr field, as mmap()ing the mapval itself doesn't make sense or is unsafe. Seems like arraymap would do the opposite, only supporting mmap()ing the mapval itself. I'm curious if any map could feasibly support both, and if so, might have to do logic like: if (map_val has mmapable kptr) mmap the pointee of mmapable kptr else mmap the map_val itself Which is maybe too confusing of a detail to expose to BPF program writers. Maybe a little too presumptuous and brainstorm-ey given the limited number of mmap()able maps currently, but making this a kptr type means maps should either ignore/fail if they don't support it, or have consistent semantics amongst maps that do support it. Instead of struct bpf_mmap_page __kptr *some_stats; I'd prefer something like struct my_type { long count; long another_count; }; struct mmap_only_value { struct my_type __mmapable_kptr *some_stats; }; This way the type of mmap()able field is known to BPF programs that interact with it. This is all assuming that struct bpf_mmap_page is an opaque page-sized blob of bytes. 
We could then support structs like struct mmap_value_and_lock { struct bpf_spin_lock l; int some_int; struct my_type __mmapable_kptr *some_stats; }; and have bpf_map_update_elem handler use the spin_lock instead of map-internal lock where appropriate. But no way to ensure userspace task using the mmap()ed region uses the spin_lock. >> diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c >> index 146824cc9689..9b3becbcc1a3 100644 >> --- a/kernel/bpf/bpf_local_storage.c >> +++ b/kernel/bpf/bpf_local_storage.c >> @@ -15,7 +15,8 @@ >> #include <linux/rcupdate_trace.h> >> #include <linux/rcupdate_wait.h> >> -#define BPF_LOCAL_STORAGE_CREATE_FLAG_MASK (BPF_F_NO_PREALLOC | BPF_F_CLONE) >> +#define BPF_LOCAL_STORAGE_CREATE_FLAG_MASK \ >> + (BPF_F_NO_PREALLOC | BPF_F_CLONE | BPF_F_MMAPABLE) >> static struct bpf_local_storage_map_bucket * >> select_bucket(struct bpf_local_storage_map *smap, >> @@ -24,6 +25,51 @@ select_bucket(struct bpf_local_storage_map *smap, >> return &smap->buckets[hash_ptr(selem, smap->bucket_log)]; >> } >> +struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map); >> + >> +void *alloc_mmapable_selem_value(struct bpf_local_storage_map *smap) >> +{ >> + struct mem_cgroup *memcg, *old_memcg; >> + void *ptr; >> + >> + memcg = bpf_map_get_memcg(&smap->map); >> + old_memcg = set_active_memcg(memcg); >> + ptr = bpf_map_area_mmapable_alloc(PAGE_ALIGN(smap->map.value_size), >> + NUMA_NO_NODE); >> + set_active_memcg(old_memcg); >> + mem_cgroup_put(memcg); >> + >> + return ptr; >> +} > > [ ... ] > >> @@ -76,10 +122,19 @@ bpf_selem_alloc(struct bpf_local_storage_map *smap, void *owner, >> void *value, bool charge_mem, gfp_t gfp_flags) >> { >> struct bpf_local_storage_elem *selem; >> + void *mmapable_value = NULL; >> + u32 selem_mem; >> - if (charge_mem && mem_charge(smap, owner, smap->elem_size)) >> + selem_mem = selem_bytes_used(smap); >> + if (charge_mem && mem_charge(smap, owner, selem_mem)) >> return NULL; >> + if (smap->map.map_flags & BPF_F_MMAPABLE) { >> + mmapable_value = alloc_mmapable_selem_value(smap); > > This probably is not always safe for bpf prog to do. Leaving the gfp_flags was not used aside, the bpf local storage is moving to the bpf's memalloc because of https://lore.kernel.org/bpf/20221118190109.1512674-1-namhyung@kernel.org/ > Minor point: alloc_mmapable_selem_value's bpf_map_area_mmapable_alloc call will always call vmalloc under the hood. vmalloc has locks as well, so your point stands. I think I see how this ties into your 'specific kptr type' proposal above. Let me know if this sounds right: if there was a bpf_mem_alloc type focused on providing vmalloc'd mmap()able memory, we could use it here instead of raw vmalloc and avoid the lock recursion problem linked above. Such an allocator could be used in something like bpf_obj_new to create the __mmapable_kptr - either from BPF prog or mmap path before remap_vmalloc_range. re: gfp_flags, looks like verifier is setting this param to either GFP_ATOMIC or GFP_KERNEL. Looks like we should not allow GFP_KERNEL allocs here? >> + if (!mmapable_value) >> + goto err_out; >> + } >> + >
On 11/20/23 12:59 PM, Dave Marchevsky wrote: > This patch modifies the generic bpf_local_storage infrastructure to > support mmapable map values and adds mmap() handling to task_local > storage leveraging this new functionality. A userspace task which > mmap's a task_local storage map will receive a pointer to the map_value > corresponding to that tasks' key - mmap'ing in other tasks' mapvals is > not supported in this patch. > > Currently, struct bpf_local_storage_elem contains both bookkeeping > information as well as a struct bpf_local_storage_data with additional > bookkeeping information and the actual mapval data. We can't simply map > the page containing this struct into userspace. Instead, mmapable > local_storage uses bpf_local_storage_data's data field to point to the > actual mapval, which is allocated separately such that it can be > mmapped. Only the mapval lives on the page(s) allocated for it. > > The lifetime of the actual_data mmapable region is tied to the > bpf_local_storage_elem which points to it. This doesn't necessarily mean > that the pages go away when the bpf_local_storage_elem is free'd - if > they're mapped into some userspace process they will remain until > unmapped, but are no longer the task_local storage's mapval. > > Implementation details: > > * A few small helpers are added to deal with bpf_local_storage_data's > 'data' field having different semantics when the local_storage map > is mmapable. With their help, many of the changes to existing code > are purely mechanical (e.g. sdata->data becomes sdata_mapval(sdata), > selem->elem_size becomes selem_bytes_used(selem)). > > * The map flags are copied into bpf_local_storage_data when its > containing bpf_local_storage_elem is alloc'd, since the > bpf_local_storage_map associated with them may be gone when > bpf_local_storage_data is free'd, and testing flags for > BPF_F_MMAPABLE is necessary when free'ing to ensure that the > mmapable region is free'd. > * The extra field doesn't change bpf_local_storage_elem's size. > There were 48 bytes of padding after the bpf_local_storage_data > field, now there are 40. > > * Currently, bpf_local_storage_update always creates a new > bpf_local_storage_elem for the 'updated' value - the only exception > being if the map_value has a bpf_spin_lock field, in which case the > spin lock is grabbed instead of the less granular bpf_local_storage > lock, and the value updated in place. This inplace update behavior > is desired for mmapable local_storage map_values as well, since > creating a new selem would result in new mmapable pages. > > * The size of the mmapable pages are accounted for when calling > mem_{charge,uncharge}. If the pages are mmap'd into a userspace task > mem_uncharge may be called before they actually go away. > > Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> > --- > include/linux/bpf_local_storage.h | 14 ++- > kernel/bpf/bpf_local_storage.c | 145 ++++++++++++++++++++++++------ > kernel/bpf/bpf_task_storage.c | 35 ++++++-- > kernel/bpf/syscall.c | 2 +- > 4 files changed, 163 insertions(+), 33 deletions(-) > > diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h > index 173ec7f43ed1..114973f925ea 100644 > --- a/include/linux/bpf_local_storage.h > +++ b/include/linux/bpf_local_storage.h > @@ -69,7 +69,17 @@ struct bpf_local_storage_data { > * the number of cachelines accessed during the cache hit case. 
> */ > struct bpf_local_storage_map __rcu *smap; > - u8 data[] __aligned(8); > + /* Need to duplicate smap's map_flags as smap may be gone when > + * it's time to free bpf_local_storage_data > + */ > + u64 smap_map_flags; > + /* If BPF_F_MMAPABLE, this is a void * to separately-alloc'd data > + * Otherwise the actual mapval data lives here > + */ > + union { > + DECLARE_FLEX_ARRAY(u8, data) __aligned(8); > + void *actual_data __aligned(8); I think we can remove __aligned(8) from 'void *actual_data __aligned(8)' in the above. There are two reasons, first, the first element in the union is aligned(8) and then the rest of union members will be at least aligned 8. second, IIUC, in the code, we have actual_data = mmapable_value where mmapable_value has type 'void *'. So actual_data is used to hold a pointer to mmap-ed page, so it only needs pointer type alignment which can already be achieved with 'void *'. So we can remove __aligned(8) safely. It can also remove an impression that 'actual_data' has to be 8-byte aligned in 32-bit system although it does not need to be just by 'actual_data' itself. > + }; > }; > > /* Linked to bpf_local_storage and bpf_local_storage_map */ > @@ -124,6 +134,8 @@ static struct bpf_local_storage_cache name = { \ > /* Helper functions for bpf_local_storage */ > int bpf_local_storage_map_alloc_check(union bpf_attr *attr); > > +void *sdata_mapval(struct bpf_local_storage_data *data); > + > struct bpf_map * > bpf_local_storage_map_alloc(union bpf_attr *attr, > struct bpf_local_storage_cache *cache, [...]
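Concretely, the layout with that suggestion applied would look like the sketch below: the same fields as the quoted diff, with only the second __aligned(8) dropped, since the flex-array member already forces 8-byte alignment for every union member.

```c
struct bpf_local_storage_data {
	struct bpf_local_storage_map __rcu *smap;
	/* Duplicate of smap's map_flags; smap may be gone at free time */
	u64 smap_map_flags;
	union {
		DECLARE_FLEX_ARRAY(u8, data) __aligned(8);
		void *actual_data;	/* used when BPF_F_MMAPABLE */
	};
};
```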
On 11/20/23 12:59 PM, Dave Marchevsky wrote: > This patch modifies the generic bpf_local_storage infrastructure to > support mmapable map values and adds mmap() handling to task_local > storage leveraging this new functionality. A userspace task which > mmap's a task_local storage map will receive a pointer to the map_value > corresponding to that tasks' key - mmap'ing in other tasks' mapvals is > not supported in this patch. > > Currently, struct bpf_local_storage_elem contains both bookkeeping > information as well as a struct bpf_local_storage_data with additional > bookkeeping information and the actual mapval data. We can't simply map > the page containing this struct into userspace. Instead, mmapable > local_storage uses bpf_local_storage_data's data field to point to the > actual mapval, which is allocated separately such that it can be > mmapped. Only the mapval lives on the page(s) allocated for it. > > The lifetime of the actual_data mmapable region is tied to the > bpf_local_storage_elem which points to it. This doesn't necessarily mean > that the pages go away when the bpf_local_storage_elem is free'd - if > they're mapped into some userspace process they will remain until > unmapped, but are no longer the task_local storage's mapval. > > Implementation details: > > * A few small helpers are added to deal with bpf_local_storage_data's > 'data' field having different semantics when the local_storage map > is mmapable. With their help, many of the changes to existing code > are purely mechanical (e.g. sdata->data becomes sdata_mapval(sdata), > selem->elem_size becomes selem_bytes_used(selem)). > > * The map flags are copied into bpf_local_storage_data when its > containing bpf_local_storage_elem is alloc'd, since the > bpf_local_storage_map associated with them may be gone when > bpf_local_storage_data is free'd, and testing flags for > BPF_F_MMAPABLE is necessary when free'ing to ensure that the > mmapable region is free'd. > * The extra field doesn't change bpf_local_storage_elem's size. > There were 48 bytes of padding after the bpf_local_storage_data > field, now there are 40. > > * Currently, bpf_local_storage_update always creates a new > bpf_local_storage_elem for the 'updated' value - the only exception > being if the map_value has a bpf_spin_lock field, in which case the > spin lock is grabbed instead of the less granular bpf_local_storage > lock, and the value updated in place. This inplace update behavior > is desired for mmapable local_storage map_values as well, since > creating a new selem would result in new mmapable pages. > > * The size of the mmapable pages are accounted for when calling > mem_{charge,uncharge}. If the pages are mmap'd into a userspace task > mem_uncharge may be called before they actually go away. > > Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> > --- > include/linux/bpf_local_storage.h | 14 ++- > kernel/bpf/bpf_local_storage.c | 145 ++++++++++++++++++++++++------ > kernel/bpf/bpf_task_storage.c | 35 ++++++-- > kernel/bpf/syscall.c | 2 +- > 4 files changed, 163 insertions(+), 33 deletions(-) > > diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h > index 173ec7f43ed1..114973f925ea 100644 > --- a/include/linux/bpf_local_storage.h > +++ b/include/linux/bpf_local_storage.h > @@ -69,7 +69,17 @@ struct bpf_local_storage_data { > * the number of cachelines accessed during the cache hit case. 
> */ > struct bpf_local_storage_map __rcu *smap; > - u8 data[] __aligned(8); > + /* Need to duplicate smap's map_flags as smap may be gone when > + * it's time to free bpf_local_storage_data > + */ > + u64 smap_map_flags; > + /* If BPF_F_MMAPABLE, this is a void * to separately-alloc'd data > + * Otherwise the actual mapval data lives here > + */ > + union { > + DECLARE_FLEX_ARRAY(u8, data) __aligned(8); > + void *actual_data __aligned(8); > + }; > }; > > /* Linked to bpf_local_storage and bpf_local_storage_map */ > @@ -124,6 +134,8 @@ static struct bpf_local_storage_cache name = { \ > /* Helper functions for bpf_local_storage */ > int bpf_local_storage_map_alloc_check(union bpf_attr *attr); > > +void *sdata_mapval(struct bpf_local_storage_data *data); > + > struct bpf_map * > bpf_local_storage_map_alloc(union bpf_attr *attr, > struct bpf_local_storage_cache *cache, > diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c > index 146824cc9689..9b3becbcc1a3 100644 > --- a/kernel/bpf/bpf_local_storage.c [...] > @@ -583,14 +665,14 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, > err = bpf_local_storage_alloc(owner, smap, selem, gfp_flags); > if (err) { > bpf_selem_free(selem, smap, true); > - mem_uncharge(smap, owner, smap->elem_size); > + mem_uncharge(smap, owner, selem_bytes_used(smap)); > return ERR_PTR(err); > } > > return SDATA(selem); > } > > - if ((map_flags & BPF_F_LOCK) && !(map_flags & BPF_NOEXIST)) { > + if (can_update_existing_selem(smap, map_flags) && !(map_flags & BPF_NOEXIST)) { > /* Hoping to find an old_sdata to do inline update > * such that it can avoid taking the local_storage->lock > * and changing the lists. > @@ -601,8 +683,13 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, > if (err) > return ERR_PTR(err); > if (old_sdata && selem_linked_to_storage_lockless(SELEM(old_sdata))) { > - copy_map_value_locked(&smap->map, old_sdata->data, > - value, false); > + if (map_flags & BPF_F_LOCK) > + copy_map_value_locked(&smap->map, > + sdata_mapval(old_sdata), > + value, false); > + else > + copy_map_value(&smap->map, sdata_mapval(old_sdata), > + value); IIUC, if two 'storage_update' to the same map/key and then these two updates will be serialized due to spin_lock. How about concurrent update for mmap'ed sdata, do we need any protection here? > return old_sdata; > } > } > @@ -633,8 +720,8 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, > goto unlock; > > if (old_sdata && (map_flags & BPF_F_LOCK)) { > - copy_map_value_locked(&smap->map, old_sdata->data, value, > - false); > + copy_map_value_locked(&smap->map, sdata_mapval(old_sdata), > + value, false); > selem = SELEM(old_sdata); > goto unlock; > } > @@ -656,7 +743,7 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, > unlock: > raw_spin_unlock_irqrestore(&local_storage->lock, flags); > if (alloc_selem) { > - mem_uncharge(smap, owner, smap->elem_size); > + mem_uncharge(smap, owner, selem_bytes_used(smap)); > bpf_selem_free(alloc_selem, smap, true); > } > return err ? ERR_PTR(err) : SDATA(selem); > @@ -707,6 +794,10 @@ int bpf_local_storage_map_alloc_check(union bpf_attr *attr) > if (attr->value_size > BPF_LOCAL_STORAGE_MAX_VALUE_SIZE) > return -E2BIG; > > + if ((attr->map_flags & BPF_F_MMAPABLE) && > + attr->map_type != BPF_MAP_TYPE_TASK_STORAGE) > + return -EINVAL; > + > return 0; > } > [...]
On 11/20/23 10:11 PM, David Marchevsky wrote: > > > On 11/20/23 7:42 PM, Martin KaFai Lau wrote: >> On 11/20/23 9:59 AM, Dave Marchevsky wrote: >>> diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h >>> index 173ec7f43ed1..114973f925ea 100644 >>> --- a/include/linux/bpf_local_storage.h >>> +++ b/include/linux/bpf_local_storage.h >>> @@ -69,7 +69,17 @@ struct bpf_local_storage_data { >>> * the number of cachelines accessed during the cache hit case. >>> */ >>> struct bpf_local_storage_map __rcu *smap; >>> - u8 data[] __aligned(8); >>> + /* Need to duplicate smap's map_flags as smap may be gone when >>> + * it's time to free bpf_local_storage_data >>> + */ >>> + u64 smap_map_flags; >>> + /* If BPF_F_MMAPABLE, this is a void * to separately-alloc'd data >>> + * Otherwise the actual mapval data lives here >>> + */ >>> + union { >>> + DECLARE_FLEX_ARRAY(u8, data) __aligned(8); >>> + void *actual_data __aligned(8); >> >> The pages (that can be mmap'ed later) feel like a specific kind of kptr. >> >> Have you thought about allowing a kptr (pointing to some pages that can be mmap'ed later) to be stored as one of the members of the map's value as a kptr. bpf_local_storage_map is one of the maps that supports kptr. >> >> struct normal_and_mmap_value { >> int some_int; >> int __percpu_kptr *some_cnts; >> >> struct bpf_mmap_page __kptr *some_stats; >> }; >> >> struct mmap_only_value { >> struct bpf_mmap_page __kptr *some_stats; >> }; >> >> [ ... ] >> > > This is an intriguing idea. For conciseness I'll call this specific > kind of kptr 'mmapable kptrs' for the rest of this message. Below is > more of a brainstorming dump than a cohesive response, separate trains > of thought are separated by two newlines. Thanks for bearing with me while some ideas could be crazy. I am trying to see how this would look like for other local storage, sk and inode. Allocating a page for each sk will not be nice for server with half a million sk(s). e.g. half a million sk(s) sharing a few bandwidth policies or a few tuning parameters. Creating something mmap'able to the user space and also sharable among many sk(s) will be useful. > > > My initial thought upon seeing struct normal_and_mmap_value was to note > that we currently don't support mmaping for map_value types with _any_ > special fields ('special' as determined by btf_parse_fields). But IIUC > you're actually talking about exposing the some_stats pointee memory via > mmap, not the containing struct with kptr fields. That is, for maps that > support these kptrs, mmap()ing a map with value type struct > normal_and_mmap_value would return the some_stats pointer value, and > likely initialize the pointer similarly to BPF_LOCAL_STORAGE_GET_F_CREATE > logic in this patch. We'd only be able to support one such mmapable kptr > field per mapval type, but that isn't a dealbreaker. > > Some maps, like task_storage, would only support mmap() on a map_value > with mmapable kptr field, as mmap()ing the mapval itself doesn't make > sense or is unsafe. Seems like arraymap would do the opposite, only Changing direction a bit since arraymap is brought up. :) arraymap supports BPF_F_MMAPABLE. If the local storage map's value can store an arraymap as kptr, the bpf prog should be able to access it as a map. More like the current map-in-map setup. The arraymap can be used as regular map in the user space also (like pinning). It may need some btf plumbing to tell the value type of the arrayamp to the verifier. 
The syscall bpf_map_update_elem(task_storage_map_fd, &task_pidfd, &value, flags) can be used where the value->array_mmap initialized as an arraymap_fd. This will limit the arraymap kptr update only from the syscall side which seems to be your usecase also? Allocating the arraymap from the bpf prog side needs some thoughts and need a whitelist. The same goes for the syscall bpf_map_lookup_elem(task_storage_map_fd, &task_pidfd, &value). The kernel can return a fd in value->array_mmap. May be we can create a libbpf helper to free the fd(s) resources held in the looked-up value by using the value's btf. The bpf_local_storage_map side probably does not need to support mmap() then. > supporting mmap()ing the mapval itself. I'm curious if any map could > feasibly support both, and if so, might have to do logic like: > > if (map_val has mmapable kptr) > mmap the pointee of mmapable kptr > else > mmap the map_val itself > > Which is maybe too confusing of a detail to expose to BPF program > writers. Maybe a little too presumptuous and brainstorm-ey given the > limited number of mmap()able maps currently, but making this a kptr type > means maps should either ignore/fail if they don't support it, or have > consistent semantics amongst maps that do support it. > > > Instead of struct bpf_mmap_page __kptr *some_stats; I'd prefer > something like > > struct my_type { long count; long another_count; }; > > struct mmap_only_value { > struct my_type __mmapable_kptr *some_stats; > }; > > This way the type of mmap()able field is known to BPF programs that > interact with it. This is all assuming that struct bpf_mmap_page is an > opaque page-sized blob of bytes. > > > We could then support structs like > > struct mmap_value_and_lock { > struct bpf_spin_lock l; > int some_int; > struct my_type __mmapable_kptr *some_stats; > }; > > and have bpf_map_update_elem handler use the spin_lock instead of > map-internal lock where appropriate. But no way to ensure userspace task > using the mmap()ed region uses the spin_lock. > >>> diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c >>> index 146824cc9689..9b3becbcc1a3 100644 >>> --- a/kernel/bpf/bpf_local_storage.c >>> +++ b/kernel/bpf/bpf_local_storage.c >>> @@ -15,7 +15,8 @@ >>> #include <linux/rcupdate_trace.h> >>> #include <linux/rcupdate_wait.h> >>> -#define BPF_LOCAL_STORAGE_CREATE_FLAG_MASK (BPF_F_NO_PREALLOC | BPF_F_CLONE) >>> +#define BPF_LOCAL_STORAGE_CREATE_FLAG_MASK \ >>> + (BPF_F_NO_PREALLOC | BPF_F_CLONE | BPF_F_MMAPABLE) >>> static struct bpf_local_storage_map_bucket * >>> select_bucket(struct bpf_local_storage_map *smap, >>> @@ -24,6 +25,51 @@ select_bucket(struct bpf_local_storage_map *smap, >>> return &smap->buckets[hash_ptr(selem, smap->bucket_log)]; >>> } >>> +struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map); >>> + >>> +void *alloc_mmapable_selem_value(struct bpf_local_storage_map *smap) >>> +{ >>> + struct mem_cgroup *memcg, *old_memcg; >>> + void *ptr; >>> + >>> + memcg = bpf_map_get_memcg(&smap->map); >>> + old_memcg = set_active_memcg(memcg); >>> + ptr = bpf_map_area_mmapable_alloc(PAGE_ALIGN(smap->map.value_size), >>> + NUMA_NO_NODE); >>> + set_active_memcg(old_memcg); >>> + mem_cgroup_put(memcg); >>> + >>> + return ptr; >>> +} >> >> [ ... 
] >> >>> @@ -76,10 +122,19 @@ bpf_selem_alloc(struct bpf_local_storage_map *smap, void *owner, >>> void *value, bool charge_mem, gfp_t gfp_flags) >>> { >>> struct bpf_local_storage_elem *selem; >>> + void *mmapable_value = NULL; >>> + u32 selem_mem; >>> - if (charge_mem && mem_charge(smap, owner, smap->elem_size)) >>> + selem_mem = selem_bytes_used(smap); >>> + if (charge_mem && mem_charge(smap, owner, selem_mem)) >>> return NULL; >>> + if (smap->map.map_flags & BPF_F_MMAPABLE) { >>> + mmapable_value = alloc_mmapable_selem_value(smap); >> >> This probably is not always safe for bpf prog to do. Leaving the gfp_flags was not used aside, the bpf local storage is moving to the bpf's memalloc because of https://lore.kernel.org/bpf/20221118190109.1512674-1-namhyung@kernel.org/ >> > > Minor point: alloc_mmapable_selem_value's bpf_map_area_mmapable_alloc > call will always call vmalloc under the hood. vmalloc has locks as well, > so your point stands. > > I think I see how this ties into your 'specific kptr type' proposal > above. Let me know if this sounds right: if there was a bpf_mem_alloc > type focused on providing vmalloc'd mmap()able memory, we could use it > here instead of raw vmalloc and avoid the lock recursion problem linked > above. Such an allocator could be used in something like bpf_obj_new to > create the __mmapable_kptr - either from BPF prog or mmap path before > remap_vmalloc_range. > > re: gfp_flags, looks like verifier is setting this param to either > GFP_ATOMIC or GFP_KERNEL. Looks like we should not allow GFP_KERNEL > allocs here? Going back to this patch, not sure what does it take to make bpf_mem_alloc() mmap()able. May be we can limit the blast radius for now, like limit this alloc to the user space mmap() call for now. Does it fit your use case? A whitelist for bpf prog could also be created later if needed. > >>> + if (!mmapable_value) >>> + goto err_out; >>> + } >>> + >>
On Mon, Nov 20, 2023 at 9:59 AM Dave Marchevsky <davemarchevsky@fb.com> wrote: > > This patch modifies the generic bpf_local_storage infrastructure to > support mmapable map values and adds mmap() handling to task_local > storage leveraging this new functionality. A userspace task which > mmap's a task_local storage map will receive a pointer to the map_value > corresponding to that tasks' key - mmap'ing in other tasks' mapvals is > not supported in this patch. > > Currently, struct bpf_local_storage_elem contains both bookkeeping > information as well as a struct bpf_local_storage_data with additional > bookkeeping information and the actual mapval data. We can't simply map > the page containing this struct into userspace. Instead, mmapable > local_storage uses bpf_local_storage_data's data field to point to the > actual mapval, which is allocated separately such that it can be > mmapped. Only the mapval lives on the page(s) allocated for it. > > The lifetime of the actual_data mmapable region is tied to the > bpf_local_storage_elem which points to it. This doesn't necessarily mean > that the pages go away when the bpf_local_storage_elem is free'd - if > they're mapped into some userspace process they will remain until > unmapped, but are no longer the task_local storage's mapval. > > Implementation details: > > * A few small helpers are added to deal with bpf_local_storage_data's > 'data' field having different semantics when the local_storage map > is mmapable. With their help, many of the changes to existing code > are purely mechanical (e.g. sdata->data becomes sdata_mapval(sdata), > selem->elem_size becomes selem_bytes_used(selem)). might be worth doing this as a pre-patch with no functional changes to make the main change a bit smaller and more focused? > > * The map flags are copied into bpf_local_storage_data when its > containing bpf_local_storage_elem is alloc'd, since the > bpf_local_storage_map associated with them may be gone when > bpf_local_storage_data is free'd, and testing flags for > BPF_F_MMAPABLE is necessary when free'ing to ensure that the > mmapable region is free'd. > * The extra field doesn't change bpf_local_storage_elem's size. > There were 48 bytes of padding after the bpf_local_storage_data > field, now there are 40. > > * Currently, bpf_local_storage_update always creates a new > bpf_local_storage_elem for the 'updated' value - the only exception > being if the map_value has a bpf_spin_lock field, in which case the > spin lock is grabbed instead of the less granular bpf_local_storage > lock, and the value updated in place. This inplace update behavior > is desired for mmapable local_storage map_values as well, since > creating a new selem would result in new mmapable pages. > > * The size of the mmapable pages are accounted for when calling > mem_{charge,uncharge}. If the pages are mmap'd into a userspace task > mem_uncharge may be called before they actually go away. 
> > Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> > --- > include/linux/bpf_local_storage.h | 14 ++- > kernel/bpf/bpf_local_storage.c | 145 ++++++++++++++++++++++++------ > kernel/bpf/bpf_task_storage.c | 35 ++++++-- > kernel/bpf/syscall.c | 2 +- > 4 files changed, 163 insertions(+), 33 deletions(-) > > diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h > index 173ec7f43ed1..114973f925ea 100644 > --- a/include/linux/bpf_local_storage.h > +++ b/include/linux/bpf_local_storage.h > @@ -69,7 +69,17 @@ struct bpf_local_storage_data { > * the number of cachelines accessed during the cache hit case. > */ > struct bpf_local_storage_map __rcu *smap; > - u8 data[] __aligned(8); > + /* Need to duplicate smap's map_flags as smap may be gone when > + * it's time to free bpf_local_storage_data > + */ > + u64 smap_map_flags; > + /* If BPF_F_MMAPABLE, this is a void * to separately-alloc'd data > + * Otherwise the actual mapval data lives here > + */ > + union { > + DECLARE_FLEX_ARRAY(u8, data) __aligned(8); > + void *actual_data __aligned(8); I don't know if it's the issue, but probably best to keep FLEX_ARRAY member the last even within the union, just in case if some tool doesn't handle FLEX_ARRAY not being last line number-wise? > + }; > }; > > /* Linked to bpf_local_storage and bpf_local_storage_map */ > @@ -124,6 +134,8 @@ static struct bpf_local_storage_cache name = { \ > /* Helper functions for bpf_local_storage */ > int bpf_local_storage_map_alloc_check(union bpf_attr *attr); > > +void *sdata_mapval(struct bpf_local_storage_data *data); > + > struct bpf_map * > bpf_local_storage_map_alloc(union bpf_attr *attr, > struct bpf_local_storage_cache *cache, > diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c > index 146824cc9689..9b3becbcc1a3 100644 > --- a/kernel/bpf/bpf_local_storage.c > +++ b/kernel/bpf/bpf_local_storage.c > @@ -15,7 +15,8 @@ > #include <linux/rcupdate_trace.h> > #include <linux/rcupdate_wait.h> > > -#define BPF_LOCAL_STORAGE_CREATE_FLAG_MASK (BPF_F_NO_PREALLOC | BPF_F_CLONE) > +#define BPF_LOCAL_STORAGE_CREATE_FLAG_MASK \ > + (BPF_F_NO_PREALLOC | BPF_F_CLONE | BPF_F_MMAPABLE) > > static struct bpf_local_storage_map_bucket * > select_bucket(struct bpf_local_storage_map *smap, > @@ -24,6 +25,51 @@ select_bucket(struct bpf_local_storage_map *smap, > return &smap->buckets[hash_ptr(selem, smap->bucket_log)]; > } > > +struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map); > + > +void *alloc_mmapable_selem_value(struct bpf_local_storage_map *smap) > +{ > + struct mem_cgroup *memcg, *old_memcg; > + void *ptr; > + > + memcg = bpf_map_get_memcg(&smap->map); > + old_memcg = set_active_memcg(memcg); > + ptr = bpf_map_area_mmapable_alloc(PAGE_ALIGN(smap->map.value_size), > + NUMA_NO_NODE); > + set_active_memcg(old_memcg); > + mem_cgroup_put(memcg); > + > + return ptr; > +} > + > +void *sdata_mapval(struct bpf_local_storage_data *data) > +{ > + if (data->smap_map_flags & BPF_F_MMAPABLE) > + return data->actual_data; > + return &data->data; > +} given this being potentially high-frequency helper called from other .o files and it is simple, should this be a static inline in .h header instead? > + > +static size_t sdata_data_field_size(struct bpf_local_storage_map *smap, > + struct bpf_local_storage_data *data) > +{ > + if (smap->map.map_flags & BPF_F_MMAPABLE) > + return sizeof(void *); > + return (size_t)smap->map.value_size; > +} > + [...] 
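As a concrete illustration of that last point, the accessor could move into include/linux/bpf_local_storage.h roughly as follows; it is the same body as the patch, just marked static inline so cross-object callers avoid a function call.

```c
/* include/linux/bpf_local_storage.h (sketch of the suggested change) */
static inline void *sdata_mapval(struct bpf_local_storage_data *data)
{
	if (data->smap_map_flags & BPF_F_MMAPABLE)
		return data->actual_data;
	return &data->data;
}
```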
> diff --git a/kernel/bpf/bpf_task_storage.c b/kernel/bpf/bpf_task_storage.c > index adf6dfe0ba68..ce75c8d8b2ce 100644 > --- a/kernel/bpf/bpf_task_storage.c > +++ b/kernel/bpf/bpf_task_storage.c > @@ -90,6 +90,7 @@ void bpf_task_storage_free(struct task_struct *task) > static void *bpf_pid_task_storage_lookup_elem(struct bpf_map *map, void *key) > { > struct bpf_local_storage_data *sdata; > + struct bpf_local_storage_map *smap; > struct task_struct *task; > unsigned int f_flags; > struct pid *pid; > @@ -114,7 +115,8 @@ static void *bpf_pid_task_storage_lookup_elem(struct bpf_map *map, void *key) > sdata = task_storage_lookup(task, map, true); > bpf_task_storage_unlock(); > put_pid(pid); > - return sdata ? sdata->data : NULL; > + smap = (struct bpf_local_storage_map *)map; smap seems unused? > + return sdata ? sdata_mapval(sdata) : NULL; > out: > put_pid(pid); > return ERR_PTR(err); > @@ -209,18 +211,19 @@ static void *__bpf_task_storage_get(struct bpf_map *map, > u64 flags, gfp_t gfp_flags, bool nobusy) > { > struct bpf_local_storage_data *sdata; > + struct bpf_local_storage_map *smap; > > + smap = (struct bpf_local_storage_map *)map; used much later, so maybe move it down? or just not change this part? > sdata = task_storage_lookup(task, map, nobusy); > if (sdata) > - return sdata->data; > + return sdata_mapval(sdata); > > /* only allocate new storage, when the task is refcounted */ > if (refcount_read(&task->usage) && > (flags & BPF_LOCAL_STORAGE_GET_F_CREATE) && nobusy) { > - sdata = bpf_local_storage_update( > - task, (struct bpf_local_storage_map *)map, value, > - BPF_NOEXIST, gfp_flags); > - return IS_ERR(sdata) ? NULL : sdata->data; > + sdata = bpf_local_storage_update(task, smap, value, > + BPF_NOEXIST, gfp_flags); > + return IS_ERR(sdata) ? NULL : sdata_mapval(sdata); > } > > return NULL; > @@ -317,6 +320,25 @@ static void task_storage_map_free(struct bpf_map *map) > bpf_local_storage_map_free(map, &task_cache, &bpf_task_storage_busy); > } > > +static int task_storage_map_mmap(struct bpf_map *map, struct vm_area_struct *vma) > +{ > + void *data; > + > + if (!(map->map_flags & BPF_F_MMAPABLE) || vma->vm_pgoff || > + (vma->vm_end - vma->vm_start) < map->value_size) so we enforce that vm_pgoff is zero, that's understandable. But why disallowing mmaping only a smaller portion of map value? Also, more importantly, I think you should reject `vma->vm_end - vma->vm_start > PAGE_ALIGN(map->value_size)`, no? Another question. I might have missed it, but where do we disallow mmap()'ing maps that have "special" fields in map_value, like kptrs, spin_locks, timers, etc? 
> + return -EINVAL; > + > + WARN_ON_ONCE(!bpf_rcu_lock_held()); > + bpf_task_storage_lock(); > + data = __bpf_task_storage_get(map, current, NULL, BPF_LOCAL_STORAGE_GET_F_CREATE, > + 0, true); > + bpf_task_storage_unlock(); > + if (!data) > + return -EINVAL; > + > + return remap_vmalloc_range(vma, data, vma->vm_pgoff); > +} > + > BTF_ID_LIST_GLOBAL_SINGLE(bpf_local_storage_map_btf_id, struct, bpf_local_storage_map) > const struct bpf_map_ops task_storage_map_ops = { > .map_meta_equal = bpf_map_meta_equal, > @@ -331,6 +353,7 @@ const struct bpf_map_ops task_storage_map_ops = { > .map_mem_usage = bpf_local_storage_map_mem_usage, > .map_btf_id = &bpf_local_storage_map_btf_id[0], > .map_owner_storage_ptr = task_storage_ptr, > + .map_mmap = task_storage_map_mmap, > }; > > const struct bpf_func_proto bpf_task_storage_get_recur_proto = { > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c > index 5e43ddd1b83f..d7c05a509870 100644 > --- a/kernel/bpf/syscall.c > +++ b/kernel/bpf/syscall.c > @@ -404,7 +404,7 @@ static void bpf_map_release_memcg(struct bpf_map *map) > obj_cgroup_put(map->objcg); > } > > -static struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map) > +struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map) > { > if (map->objcg) > return get_mem_cgroup_from_objcg(map->objcg); > -- > 2.34.1 >
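For the size checks being questioned above, one tightened variant (allowing partial mappings but rejecting anything larger than the page-aligned backing area, as suggested) might read like this; it is a sketch of the suggested check, not a statement of what the final series does.

```c
	/* vm_pgoff must be 0; reject mappings larger than the backing area */
	if (!(map->map_flags & BPF_F_MMAPABLE) || vma->vm_pgoff ||
	    (vma->vm_end - vma->vm_start) > PAGE_ALIGN(map->value_size))
		return -EINVAL;
```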
On Tue, Nov 21, 2023 at 11:27 AM Martin KaFai Lau <martin.lau@linux.dev> wrote: > > On 11/20/23 10:11 PM, David Marchevsky wrote: > > > > > > On 11/20/23 7:42 PM, Martin KaFai Lau wrote: > >> On 11/20/23 9:59 AM, Dave Marchevsky wrote: > >>> diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h > >>> index 173ec7f43ed1..114973f925ea 100644 > >>> --- a/include/linux/bpf_local_storage.h > >>> +++ b/include/linux/bpf_local_storage.h > >>> @@ -69,7 +69,17 @@ struct bpf_local_storage_data { > >>> * the number of cachelines accessed during the cache hit case. > >>> */ > >>> struct bpf_local_storage_map __rcu *smap; > >>> - u8 data[] __aligned(8); > >>> + /* Need to duplicate smap's map_flags as smap may be gone when > >>> + * it's time to free bpf_local_storage_data > >>> + */ > >>> + u64 smap_map_flags; > >>> + /* If BPF_F_MMAPABLE, this is a void * to separately-alloc'd data > >>> + * Otherwise the actual mapval data lives here > >>> + */ > >>> + union { > >>> + DECLARE_FLEX_ARRAY(u8, data) __aligned(8); > >>> + void *actual_data __aligned(8); > >> > >> The pages (that can be mmap'ed later) feel like a specific kind of kptr. > >> > >> Have you thought about allowing a kptr (pointing to some pages that can be mmap'ed later) to be stored as one of the members of the map's value as a kptr. bpf_local_storage_map is one of the maps that supports kptr. > >> > >> struct normal_and_mmap_value { > >> int some_int; > >> int __percpu_kptr *some_cnts; > >> > >> struct bpf_mmap_page __kptr *some_stats; > >> }; > >> > >> struct mmap_only_value { > >> struct bpf_mmap_page __kptr *some_stats; > >> }; > >> > >> [ ... ] > >> > > > > This is an intriguing idea. For conciseness I'll call this specific > > kind of kptr 'mmapable kptrs' for the rest of this message. Below is > > more of a brainstorming dump than a cohesive response, separate trains > > of thought are separated by two newlines. > > Thanks for bearing with me while some ideas could be crazy. I am trying to see > how this would look like for other local storage, sk and inode. Allocating a > page for each sk will not be nice for server with half a million sk(s). e.g. > half a million sk(s) sharing a few bandwidth policies or a few tuning > parameters. Creating something mmap'able to the user space and also sharable > among many sk(s) will be useful. > > > > > > > My initial thought upon seeing struct normal_and_mmap_value was to note > > that we currently don't support mmaping for map_value types with _any_ > > special fields ('special' as determined by btf_parse_fields). But IIUC > > you're actually talking about exposing the some_stats pointee memory via > > mmap, not the containing struct with kptr fields. That is, for maps that > > support these kptrs, mmap()ing a map with value type struct > > normal_and_mmap_value would return the some_stats pointer value, and > > likely initialize the pointer similarly to BPF_LOCAL_STORAGE_GET_F_CREATE > > logic in this patch. We'd only be able to support one such mmapable kptr > > field per mapval type, but that isn't a dealbreaker. > > > > Some maps, like task_storage, would only support mmap() on a map_value > > with mmapable kptr field, as mmap()ing the mapval itself doesn't make > > sense or is unsafe. Seems like arraymap would do the opposite, only > > Changing direction a bit since arraymap is brought up. :) > > arraymap supports BPF_F_MMAPABLE. If the local storage map's value can store an > arraymap as kptr, the bpf prog should be able to access it as a map. 
More like > the current map-in-map setup. The arraymap can be used as regular map in the > user space also (like pinning). It may need some btf plumbing to tell the value > type of the arrayamp to the verifier. > > The syscall bpf_map_update_elem(task_storage_map_fd, &task_pidfd, &value, flags) > can be used where the value->array_mmap initialized as an arraymap_fd. This will > limit the arraymap kptr update only from the syscall side which seems to be your > usecase also? Allocating the arraymap from the bpf prog side needs some thoughts > and need a whitelist. > > The same goes for the syscall bpf_map_lookup_elem(task_storage_map_fd, > &task_pidfd, &value). The kernel can return a fd in value->array_mmap. May be we > can create a libbpf helper to free the fd(s) resources held in the looked-up > value by using the value's btf. > > The bpf_local_storage_map side probably does not need to support mmap() then. Martin, that's an interesting idea! I kinda like it and I think it's worth exploring further. I think the main quirk of the proposed mmap-of-task-local-storage is using 'current' task as an implicit 'key' in task local storage map. It fits here, but I'm not sure it addresses sched-ext use case. Tejun, David, could you please chime in ? Do you think mmap(..., task_local_storage_map_fd, ...) that returns a page that belongs to current task only is enough ? If not we need to think through how to mmap local storage of other tasks. One proposal was to use pgoff to carry the key somehow like io-uring does, but if we want to generalize that the pgoff approach falls apart if we want __mmapable_kptr to work like Martin is proposing above, since the key will not fit in 64-bit of pgoff. Maybe we need an office hours slot to discuss. This looks to be a big topic. Not sure we can converge over email. Just getting everyone on the same page will take a lot of email reading.
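For reference, the current-task-only usage Alexei describes would look roughly like this from userspace; the struct layout and the way map_fd is obtained are illustrative, and map_fd is assumed to refer to a BPF_F_MMAPABLE task_local storage map.

```c
#include <sys/mman.h>
#include <stddef.h>

/* Illustrative value layout; must match the BPF map's value type */
struct mapval {
	long counter;
};

static struct mapval *map_current_task_storage(int map_fd, size_t value_size)
{
	/* Returns a mapping of the *calling* task's map value */
	void *p = mmap(NULL, value_size, PROT_READ | PROT_WRITE,
		       MAP_SHARED, map_fd, 0);

	return p == MAP_FAILED ? NULL : p;
}
```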
On 11/21/23 2:49 PM, Alexei Starovoitov wrote: > On Tue, Nov 21, 2023 at 11:27 AM Martin KaFai Lau <martin.lau@linux.dev> wrote: >> >> On 11/20/23 10:11 PM, David Marchevsky wrote: >>> >>> >>> On 11/20/23 7:42 PM, Martin KaFai Lau wrote: >>>> On 11/20/23 9:59 AM, Dave Marchevsky wrote: >>>>> diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h >>>>> index 173ec7f43ed1..114973f925ea 100644 >>>>> --- a/include/linux/bpf_local_storage.h >>>>> +++ b/include/linux/bpf_local_storage.h >>>>> @@ -69,7 +69,17 @@ struct bpf_local_storage_data { >>>>> * the number of cachelines accessed during the cache hit case. >>>>> */ >>>>> struct bpf_local_storage_map __rcu *smap; >>>>> - u8 data[] __aligned(8); >>>>> + /* Need to duplicate smap's map_flags as smap may be gone when >>>>> + * it's time to free bpf_local_storage_data >>>>> + */ >>>>> + u64 smap_map_flags; >>>>> + /* If BPF_F_MMAPABLE, this is a void * to separately-alloc'd data >>>>> + * Otherwise the actual mapval data lives here >>>>> + */ >>>>> + union { >>>>> + DECLARE_FLEX_ARRAY(u8, data) __aligned(8); >>>>> + void *actual_data __aligned(8); >>>> >>>> The pages (that can be mmap'ed later) feel like a specific kind of kptr. >>>> >>>> Have you thought about allowing a kptr (pointing to some pages that can be mmap'ed later) to be stored as one of the members of the map's value as a kptr. bpf_local_storage_map is one of the maps that supports kptr. >>>> >>>> struct normal_and_mmap_value { >>>> int some_int; >>>> int __percpu_kptr *some_cnts; >>>> >>>> struct bpf_mmap_page __kptr *some_stats; >>>> }; >>>> >>>> struct mmap_only_value { >>>> struct bpf_mmap_page __kptr *some_stats; >>>> }; >>>> >>>> [ ... ] >>>> >>> >>> This is an intriguing idea. For conciseness I'll call this specific >>> kind of kptr 'mmapable kptrs' for the rest of this message. Below is >>> more of a brainstorming dump than a cohesive response, separate trains >>> of thought are separated by two newlines. >> >> Thanks for bearing with me while some ideas could be crazy. I am trying to see >> how this would look like for other local storage, sk and inode. Allocating a >> page for each sk will not be nice for server with half a million sk(s). e.g. >> half a million sk(s) sharing a few bandwidth policies or a few tuning >> parameters. Creating something mmap'able to the user space and also sharable >> among many sk(s) will be useful. >> >>> >>> >>> My initial thought upon seeing struct normal_and_mmap_value was to note >>> that we currently don't support mmaping for map_value types with _any_ >>> special fields ('special' as determined by btf_parse_fields). But IIUC >>> you're actually talking about exposing the some_stats pointee memory via >>> mmap, not the containing struct with kptr fields. That is, for maps that >>> support these kptrs, mmap()ing a map with value type struct >>> normal_and_mmap_value would return the some_stats pointer value, and >>> likely initialize the pointer similarly to BPF_LOCAL_STORAGE_GET_F_CREATE >>> logic in this patch. We'd only be able to support one such mmapable kptr >>> field per mapval type, but that isn't a dealbreaker. >>> >>> Some maps, like task_storage, would only support mmap() on a map_value >>> with mmapable kptr field, as mmap()ing the mapval itself doesn't make >>> sense or is unsafe. Seems like arraymap would do the opposite, only >> >> Changing direction a bit since arraymap is brought up. :) >> >> arraymap supports BPF_F_MMAPABLE. 
If the local storage map's value can store an >> arraymap as kptr, the bpf prog should be able to access it as a map. More like >> the current map-in-map setup. The arraymap can be used as regular map in the >> user space also (like pinning). It may need some btf plumbing to tell the value >> type of the arrayamp to the verifier. >> >> The syscall bpf_map_update_elem(task_storage_map_fd, &task_pidfd, &value, flags) >> can be used where the value->array_mmap initialized as an arraymap_fd. This will >> limit the arraymap kptr update only from the syscall side which seems to be your >> usecase also? Allocating the arraymap from the bpf prog side needs some thoughts >> and need a whitelist. >> >> The same goes for the syscall bpf_map_lookup_elem(task_storage_map_fd, >> &task_pidfd, &value). The kernel can return a fd in value->array_mmap. May be we >> can create a libbpf helper to free the fd(s) resources held in the looked-up >> value by using the value's btf. >> >> The bpf_local_storage_map side probably does not need to support mmap() then. > > Martin, > that's an interesting idea! > I kinda like it and I think it's worth exploring further. > > I think the main quirk of the proposed mmap-of-task-local-storage > is using 'current' task as an implicit 'key' in task local storage map. > It fits here, but I'm not sure it addresses sched-ext use case. > > Tejun, David, > could you please chime in ? > Do you think mmap(..., task_local_storage_map_fd, ...) > that returns a page that belongs to current task only is enough ? > > If not we need to think through how to mmap local storage of other > tasks. One proposal was to use pgoff to carry the key somehow > like io-uring does, but if we want to generalize that the pgoff approach > falls apart if we want __mmapable_kptr to work like Martin is proposing above, > since the key will not fit in 64-bit of pgoff. > > Maybe we need an office hours slot to discuss. This looks to be a big > topic. Not sure we can converge over email. > Just getting everyone on the same page will take a lot of email reading. Meta BPF folks were all in one place for reasons unrelated to this thread, and decided to have a design discussion regarding this mmapable task_local storage implementation and other implementation ideas discussed in this thread. I will summarize the discussion below. We didn't arrive at a confident conclusion, though, so plenty of time for others to chime in and for proper office hours discussion to happen if necessary. Below, anything attributed to a specific person is paraphrased from my notes, there will certainly be errors / misrememberings. 
mmapable task_local storage design discussion 11/29

We first summarized approaches that were discussed in this thread:

1) Current implementation in this series

2) pseudo-map_in_map, kptr arraymap type:

struct mmapable_data_map {
	__uint(type, BPF_MAP_TYPE_ARRAY);
	__uint(map_flags, BPF_F_MMAPABLE);
	__uint(max_entries, 1);
	__type(key, __u32);
	__type(value, __u64);
};

struct my_mapval {
	int whatever;
	struct bpf_arraymap __kptr __arraymap_type(mmapable_data_map) *m;
	/* Need special logic to support getting / updating above field
	 * from userspace (as fd)
	 */
};

3) "mmapable page" kptr type, or "mmapable kptr" tag

struct my_mapval {
	int whatever;
	struct bpf_mmapable_page *m;
};

struct my_mapval2 { /* Separate approach from my_mapval above */
	struct bpf_spin_lock l;
	int some_int;
	struct my_type __mmapable_kptr *some_stats;
};

Points of consideration regardless of implementation approach:

* mmap() syscall's return address must be page-aligned. If we want to reduce / eliminate wasted memory by supporting packing of multiple mapvals onto a single page, we need to be able to return a non-page-aligned addr. Using a BPF syscall subcommand in lieu of or in addition to the mmap() syscall would be a way to support this.
* Dave wants to avoid implementing packing at all, but having a reasonable path forward would be nice in case an actual usecase arises.
* Andrii suggested a new actual mmap syscall supporting passing of custom params, useful for any subsystem using mmap in nontraditional ways. This was initially a response to the "use offset to pass task id" discussion re: selecting a non-current task.
* How orthogonal is Martin's "kptr to arraymap" suggestion from the general mmapable local_storage goal? Is there a world where we choose a different approach for this work, and then do "kptr to arraymap" independently later?

The above was mostly a summary of existing discussion in this thread. The rest of the discussion flowed from there.

Q&A:

- Do we need to be able to query other tasks' mapvals? (for David Vernet / Tejun Heo)
  TJ had a usecase where this might've been necessary, but rewrote it.
  David: Being able to do this in general, aside from TJ's specific case, would be useful. David provided an example from the ghOSt project - central scheduling. Another example: the Folly runtime framework farms out work to worker threads and might want to tag them.

- Which usecases actually care about avoiding syscall overhead?
  Alexei: Most services that would want to use this functionality to tag their tasks don't need it, as they just set the value once.
  Dave: Some tracing usecases (e.g. strobemeta) need it.

- Do we want to use mmap() in a current-task-only limited way, or do a BPF subcommand or something more exotic?
  TJ: What if a bpf subcommand returns an FD that can be mmap'd. The fd identifies which task_local storage elem is mmap'd. Subcommand: bpf_map_lookup_elem_fd(map *, u64 elem_id).
  Alexei: Such a thing should return an fd to an arbitrary mapval, and should support other common ops (open, close, etc.).
  David: What's the problem w/ having an fd that only supports mmap for now?
  TJ: 'Dangling' fds exist in some usecases already.

Discussion around the bpf_map_lookup_elem_fd idea continued for quite a while. Folks liked that it avoids the "can only have one mmapable field" issue from proposal (3) above, without making any implicit assumptions.

Alexei: Can we instead have a pointer to a userspace blob - similar to rseq - to avoid wasting most of a page?
TJ: My instinct is to stick to something more generic, would rather pay 4k.

Discussion around the userspace pointer continued for a while as well, ending in the conclusion that we should take a look at using get_user_pages, perhaps wrapping such functionality in a 'guppable' kptr type. Folks liked the 'guppable' idea as it sort-of avoids the wasted memory issue, pushing the details to userspace, and punts on working out a path forward for the mmap interface, which the other implementation ideas require.

Action items based on the convo, in priority order:

- Think more about / prototype the 'guppable' kptr idea
- If the above has issues, try bpf_map_lookup_elem_fd
- If both of the above have issues, consider earlier approaches

I will start tackling these soon.
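To make the 'guppable' direction more concrete, here is a rough kernel-side sketch of the underlying mechanism only - pinning a page of a user-supplied buffer so BPF-side updates land in memory the task already owns. The guppable_* names are invented for illustration, GUP flag/signature details vary across kernel versions, and none of this is settled API:

#include <linux/mm.h>
#include <linux/highmem.h>
#include <linux/errno.h>

struct guppable_region {
	struct page *page;	/* pinned userspace page */
	unsigned long uaddr;	/* page-aligned userspace address */
};

static int guppable_pin(struct guppable_region *r, unsigned long uaddr)
{
	int npinned;

	r->uaddr = uaddr & PAGE_MASK;
	/* FOLL_LONGTERM: the pin may outlive the syscall that created it */
	npinned = pin_user_pages_fast(r->uaddr, 1,
				      FOLL_WRITE | FOLL_LONGTERM, &r->page);
	if (npinned < 0)
		return npinned;
	return npinned == 1 ? 0 : -EFAULT;
}

static void guppable_write(struct guppable_region *r, u32 idx, u64 val)
{
	u64 *kva = kmap_local_page(r->page);

	/* Kernel and task see the same physical page, no extra copy.
	 * Caller ensures idx stays within the pinned page.
	 */
	WRITE_ONCE(kva[idx], val);
	kunmap_local(kva);
}

static void guppable_unpin(struct guppable_region *r)
{
	unpin_user_page(r->page);	/* drop the GUP pin */
}

How such a pin would be attached to a task_local storage value (the 'guppable' kptr wrapping) is exactly the open design question in the action items above.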
diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h index 173ec7f43ed1..114973f925ea 100644 --- a/include/linux/bpf_local_storage.h +++ b/include/linux/bpf_local_storage.h @@ -69,7 +69,17 @@ struct bpf_local_storage_data { * the number of cachelines accessed during the cache hit case. */ struct bpf_local_storage_map __rcu *smap; - u8 data[] __aligned(8); + /* Need to duplicate smap's map_flags as smap may be gone when + * it's time to free bpf_local_storage_data + */ + u64 smap_map_flags; + /* If BPF_F_MMAPABLE, this is a void * to separately-alloc'd data + * Otherwise the actual mapval data lives here + */ + union { + DECLARE_FLEX_ARRAY(u8, data) __aligned(8); + void *actual_data __aligned(8); + }; }; /* Linked to bpf_local_storage and bpf_local_storage_map */ @@ -124,6 +134,8 @@ static struct bpf_local_storage_cache name = { \ /* Helper functions for bpf_local_storage */ int bpf_local_storage_map_alloc_check(union bpf_attr *attr); +void *sdata_mapval(struct bpf_local_storage_data *data); + struct bpf_map * bpf_local_storage_map_alloc(union bpf_attr *attr, struct bpf_local_storage_cache *cache, diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c index 146824cc9689..9b3becbcc1a3 100644 --- a/kernel/bpf/bpf_local_storage.c +++ b/kernel/bpf/bpf_local_storage.c @@ -15,7 +15,8 @@ #include <linux/rcupdate_trace.h> #include <linux/rcupdate_wait.h> -#define BPF_LOCAL_STORAGE_CREATE_FLAG_MASK (BPF_F_NO_PREALLOC | BPF_F_CLONE) +#define BPF_LOCAL_STORAGE_CREATE_FLAG_MASK \ + (BPF_F_NO_PREALLOC | BPF_F_CLONE | BPF_F_MMAPABLE) static struct bpf_local_storage_map_bucket * select_bucket(struct bpf_local_storage_map *smap, @@ -24,6 +25,51 @@ select_bucket(struct bpf_local_storage_map *smap, return &smap->buckets[hash_ptr(selem, smap->bucket_log)]; } +struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map); + +void *alloc_mmapable_selem_value(struct bpf_local_storage_map *smap) +{ + struct mem_cgroup *memcg, *old_memcg; + void *ptr; + + memcg = bpf_map_get_memcg(&smap->map); + old_memcg = set_active_memcg(memcg); + ptr = bpf_map_area_mmapable_alloc(PAGE_ALIGN(smap->map.value_size), + NUMA_NO_NODE); + set_active_memcg(old_memcg); + mem_cgroup_put(memcg); + + return ptr; +} + +void *sdata_mapval(struct bpf_local_storage_data *data) +{ + if (data->smap_map_flags & BPF_F_MMAPABLE) + return data->actual_data; + return &data->data; +} + +static size_t sdata_data_field_size(struct bpf_local_storage_map *smap, + struct bpf_local_storage_data *data) +{ + if (smap->map.map_flags & BPF_F_MMAPABLE) + return sizeof(void *); + return (size_t)smap->map.value_size; +} + +static u32 selem_bytes_used(struct bpf_local_storage_map *smap) +{ + if (smap->map.map_flags & BPF_F_MMAPABLE) + return smap->elem_size + PAGE_ALIGN(smap->map.value_size); + return smap->elem_size; +} + +static bool can_update_existing_selem(struct bpf_local_storage_map *smap, + u64 flags) +{ + return flags & BPF_F_LOCK || smap->map.map_flags & BPF_F_MMAPABLE; +} + static int mem_charge(struct bpf_local_storage_map *smap, void *owner, u32 size) { struct bpf_map *map = &smap->map; @@ -76,10 +122,19 @@ bpf_selem_alloc(struct bpf_local_storage_map *smap, void *owner, void *value, bool charge_mem, gfp_t gfp_flags) { struct bpf_local_storage_elem *selem; + void *mmapable_value = NULL; + u32 selem_mem; - if (charge_mem && mem_charge(smap, owner, smap->elem_size)) + selem_mem = selem_bytes_used(smap); + if (charge_mem && mem_charge(smap, owner, selem_mem)) return NULL; + if (smap->map.map_flags & 
BPF_F_MMAPABLE) { + mmapable_value = alloc_mmapable_selem_value(smap); + if (!mmapable_value) + goto err_out; + } + if (smap->bpf_ma) { migrate_disable(); selem = bpf_mem_cache_alloc_flags(&smap->selem_ma, gfp_flags); @@ -92,22 +147,28 @@ bpf_selem_alloc(struct bpf_local_storage_map *smap, void *owner, * only does bpf_mem_cache_free when there is * no other bpf prog is using the selem. */ - memset(SDATA(selem)->data, 0, smap->map.value_size); + memset(SDATA(selem)->data, 0, + sdata_data_field_size(smap, SDATA(selem))); } else { selem = bpf_map_kzalloc(&smap->map, smap->elem_size, gfp_flags | __GFP_NOWARN); } - if (selem) { - if (value) - copy_map_value(&smap->map, SDATA(selem)->data, value); - /* No need to call check_and_init_map_value as memory is zero init */ - return selem; - } - + if (!selem) + goto err_out; + + selem->sdata.smap_map_flags = smap->map.map_flags; + if (smap->map.map_flags & BPF_F_MMAPABLE) + selem->sdata.actual_data = mmapable_value; + if (value) + copy_map_value(&smap->map, sdata_mapval(SDATA(selem)), value); + /* No need to call check_and_init_map_value as memory is zero init */ + return selem; +err_out: + if (mmapable_value) + bpf_map_area_free(mmapable_value); if (charge_mem) - mem_uncharge(smap, owner, smap->elem_size); - + mem_uncharge(smap, owner, selem_mem); return NULL; } @@ -184,6 +245,21 @@ static void bpf_local_storage_free(struct bpf_local_storage *local_storage, } } +static void __bpf_selem_kfree(struct bpf_local_storage_elem *selem) +{ + if (selem->sdata.smap_map_flags & BPF_F_MMAPABLE) + bpf_map_area_free(selem->sdata.actual_data); + kfree(selem); +} + +static void __bpf_selem_kfree_rcu(struct rcu_head *rcu) +{ + struct bpf_local_storage_elem *selem; + + selem = container_of(rcu, struct bpf_local_storage_elem, rcu); + __bpf_selem_kfree(selem); +} + /* rcu tasks trace callback for bpf_ma == false */ static void __bpf_selem_free_trace_rcu(struct rcu_head *rcu) { @@ -191,9 +267,9 @@ static void __bpf_selem_free_trace_rcu(struct rcu_head *rcu) selem = container_of(rcu, struct bpf_local_storage_elem, rcu); if (rcu_trace_implies_rcu_gp()) - kfree(selem); + __bpf_selem_kfree(selem); else - kfree_rcu(selem, rcu); + call_rcu(rcu, __bpf_selem_kfree_rcu); } /* Handle bpf_ma == false */ @@ -201,7 +277,7 @@ static void __bpf_selem_free(struct bpf_local_storage_elem *selem, bool vanilla_rcu) { if (vanilla_rcu) - kfree_rcu(selem, rcu); + call_rcu(&selem->rcu, __bpf_selem_kfree_rcu); else call_rcu_tasks_trace(&selem->rcu, __bpf_selem_free_trace_rcu); } @@ -209,8 +285,12 @@ static void __bpf_selem_free(struct bpf_local_storage_elem *selem, static void bpf_selem_free_rcu(struct rcu_head *rcu) { struct bpf_local_storage_elem *selem; + struct bpf_local_storage_map *smap; selem = container_of(rcu, struct bpf_local_storage_elem, rcu); + smap = selem->sdata.smap; + if (selem->sdata.smap_map_flags & BPF_F_MMAPABLE) + bpf_map_area_free(selem->sdata.actual_data); bpf_mem_cache_raw_free(selem); } @@ -241,6 +321,8 @@ void bpf_selem_free(struct bpf_local_storage_elem *selem, * immediately. */ migrate_disable(); + if (smap->map.map_flags & BPF_F_MMAPABLE) + bpf_map_area_free(selem->sdata.actual_data); bpf_mem_cache_free(&smap->selem_ma, selem); migrate_enable(); } @@ -266,7 +348,7 @@ static bool bpf_selem_unlink_storage_nolock(struct bpf_local_storage *local_stor * from local_storage. 
*/ if (uncharge_mem) - mem_uncharge(smap, owner, smap->elem_size); + mem_uncharge(smap, owner, selem_bytes_used(smap)); free_local_storage = hlist_is_singular_node(&selem->snode, &local_storage->list); @@ -583,14 +665,14 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, err = bpf_local_storage_alloc(owner, smap, selem, gfp_flags); if (err) { bpf_selem_free(selem, smap, true); - mem_uncharge(smap, owner, smap->elem_size); + mem_uncharge(smap, owner, selem_bytes_used(smap)); return ERR_PTR(err); } return SDATA(selem); } - if ((map_flags & BPF_F_LOCK) && !(map_flags & BPF_NOEXIST)) { + if (can_update_existing_selem(smap, map_flags) && !(map_flags & BPF_NOEXIST)) { /* Hoping to find an old_sdata to do inline update * such that it can avoid taking the local_storage->lock * and changing the lists. @@ -601,8 +683,13 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, if (err) return ERR_PTR(err); if (old_sdata && selem_linked_to_storage_lockless(SELEM(old_sdata))) { - copy_map_value_locked(&smap->map, old_sdata->data, - value, false); + if (map_flags & BPF_F_LOCK) + copy_map_value_locked(&smap->map, + sdata_mapval(old_sdata), + value, false); + else + copy_map_value(&smap->map, sdata_mapval(old_sdata), + value); return old_sdata; } } @@ -633,8 +720,8 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, goto unlock; if (old_sdata && (map_flags & BPF_F_LOCK)) { - copy_map_value_locked(&smap->map, old_sdata->data, value, - false); + copy_map_value_locked(&smap->map, sdata_mapval(old_sdata), + value, false); selem = SELEM(old_sdata); goto unlock; } @@ -656,7 +743,7 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, unlock: raw_spin_unlock_irqrestore(&local_storage->lock, flags); if (alloc_selem) { - mem_uncharge(smap, owner, smap->elem_size); + mem_uncharge(smap, owner, selem_bytes_used(smap)); bpf_selem_free(alloc_selem, smap, true); } return err ? ERR_PTR(err) : SDATA(selem); @@ -707,6 +794,10 @@ int bpf_local_storage_map_alloc_check(union bpf_attr *attr) if (attr->value_size > BPF_LOCAL_STORAGE_MAX_VALUE_SIZE) return -E2BIG; + if ((attr->map_flags & BPF_F_MMAPABLE) && + attr->map_type != BPF_MAP_TYPE_TASK_STORAGE) + return -EINVAL; + return 0; } @@ -820,8 +911,12 @@ bpf_local_storage_map_alloc(union bpf_attr *attr, raw_spin_lock_init(&smap->buckets[i].lock); } - smap->elem_size = offsetof(struct bpf_local_storage_elem, - sdata.data[attr->value_size]); + if (attr->map_flags & BPF_F_MMAPABLE) + smap->elem_size = offsetof(struct bpf_local_storage_elem, + sdata.data[sizeof(void *)]); + else + smap->elem_size = offsetof(struct bpf_local_storage_elem, + sdata.data[attr->value_size]); smap->bpf_ma = bpf_ma; if (bpf_ma) { diff --git a/kernel/bpf/bpf_task_storage.c b/kernel/bpf/bpf_task_storage.c index adf6dfe0ba68..ce75c8d8b2ce 100644 --- a/kernel/bpf/bpf_task_storage.c +++ b/kernel/bpf/bpf_task_storage.c @@ -90,6 +90,7 @@ void bpf_task_storage_free(struct task_struct *task) static void *bpf_pid_task_storage_lookup_elem(struct bpf_map *map, void *key) { struct bpf_local_storage_data *sdata; + struct bpf_local_storage_map *smap; struct task_struct *task; unsigned int f_flags; struct pid *pid; @@ -114,7 +115,8 @@ static void *bpf_pid_task_storage_lookup_elem(struct bpf_map *map, void *key) sdata = task_storage_lookup(task, map, true); bpf_task_storage_unlock(); put_pid(pid); - return sdata ? sdata->data : NULL; + smap = (struct bpf_local_storage_map *)map; + return sdata ? 
sdata_mapval(sdata) : NULL; out: put_pid(pid); return ERR_PTR(err); @@ -209,18 +211,19 @@ static void *__bpf_task_storage_get(struct bpf_map *map, u64 flags, gfp_t gfp_flags, bool nobusy) { struct bpf_local_storage_data *sdata; + struct bpf_local_storage_map *smap; + smap = (struct bpf_local_storage_map *)map; sdata = task_storage_lookup(task, map, nobusy); if (sdata) - return sdata->data; + return sdata_mapval(sdata); /* only allocate new storage, when the task is refcounted */ if (refcount_read(&task->usage) && (flags & BPF_LOCAL_STORAGE_GET_F_CREATE) && nobusy) { - sdata = bpf_local_storage_update( - task, (struct bpf_local_storage_map *)map, value, - BPF_NOEXIST, gfp_flags); - return IS_ERR(sdata) ? NULL : sdata->data; + sdata = bpf_local_storage_update(task, smap, value, + BPF_NOEXIST, gfp_flags); + return IS_ERR(sdata) ? NULL : sdata_mapval(sdata); } return NULL; @@ -317,6 +320,25 @@ static void task_storage_map_free(struct bpf_map *map) bpf_local_storage_map_free(map, &task_cache, &bpf_task_storage_busy); } +static int task_storage_map_mmap(struct bpf_map *map, struct vm_area_struct *vma) +{ + void *data; + + if (!(map->map_flags & BPF_F_MMAPABLE) || vma->vm_pgoff || + (vma->vm_end - vma->vm_start) < map->value_size) + return -EINVAL; + + WARN_ON_ONCE(!bpf_rcu_lock_held()); + bpf_task_storage_lock(); + data = __bpf_task_storage_get(map, current, NULL, BPF_LOCAL_STORAGE_GET_F_CREATE, + 0, true); + bpf_task_storage_unlock(); + if (!data) + return -EINVAL; + + return remap_vmalloc_range(vma, data, vma->vm_pgoff); +} + BTF_ID_LIST_GLOBAL_SINGLE(bpf_local_storage_map_btf_id, struct, bpf_local_storage_map) const struct bpf_map_ops task_storage_map_ops = { .map_meta_equal = bpf_map_meta_equal, @@ -331,6 +353,7 @@ const struct bpf_map_ops task_storage_map_ops = { .map_mem_usage = bpf_local_storage_map_mem_usage, .map_btf_id = &bpf_local_storage_map_btf_id[0], .map_owner_storage_ptr = task_storage_ptr, + .map_mmap = task_storage_map_mmap, }; const struct bpf_func_proto bpf_task_storage_get_recur_proto = { diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 5e43ddd1b83f..d7c05a509870 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -404,7 +404,7 @@ static void bpf_map_release_memcg(struct bpf_map *map) obj_cgroup_put(map->objcg); } -static struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map) +struct mem_cgroup *bpf_map_get_memcg(const struct bpf_map *map) { if (map->objcg) return get_mem_cgroup_from_objcg(map->objcg);
This patch modifies the generic bpf_local_storage infrastructure to support mmapable map values and adds mmap() handling to task_local storage leveraging this new functionality. A userspace task which mmap's a task_local storage map will receive a pointer to the map_value corresponding to that task's key - mmap'ing in other tasks' mapvals is not supported in this patch.

Currently, struct bpf_local_storage_elem contains both bookkeeping information as well as a struct bpf_local_storage_data with additional bookkeeping information and the actual mapval data. We can't simply map the page containing this struct into userspace. Instead, mmapable local_storage uses bpf_local_storage_data's data field to point to the actual mapval, which is allocated separately such that it can be mmapped. Only the mapval lives on the page(s) allocated for it.

The lifetime of the actual_data mmapable region is tied to the bpf_local_storage_elem which points to it. This doesn't necessarily mean that the pages go away when the bpf_local_storage_elem is free'd - if they're mapped into some userspace process they will remain until unmapped, but they are no longer the task_local storage's mapval.

Implementation details:

* A few small helpers are added to deal with bpf_local_storage_data's 'data' field having different semantics when the local_storage map is mmapable. With their help, many of the changes to existing code are purely mechanical (e.g. sdata->data becomes sdata_mapval(sdata), selem->elem_size becomes selem_bytes_used(selem)).

* The map flags are copied into bpf_local_storage_data when its containing bpf_local_storage_elem is alloc'd, since the bpf_local_storage_map associated with them may be gone when bpf_local_storage_data is free'd, and testing flags for BPF_F_MMAPABLE is necessary when free'ing to ensure that the mmapable region is free'd.

* The extra field doesn't change bpf_local_storage_elem's size. There were 48 bytes of padding after the bpf_local_storage_data field; now there are 40.

* Currently, bpf_local_storage_update always creates a new bpf_local_storage_elem for the 'updated' value - the only exception being if the map_value has a bpf_spin_lock field, in which case the spin lock is grabbed instead of the less granular bpf_local_storage lock, and the value is updated in place. This in-place update behavior is desired for mmapable local_storage map_values as well, since creating a new selem would result in new mmapable pages.

* The size of the mmapable pages is accounted for when calling mem_{charge,uncharge}. If the pages are mmap'd into a userspace task, mem_uncharge may be called before they actually go away.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
---
 include/linux/bpf_local_storage.h |  14 ++-
 kernel/bpf/bpf_local_storage.c    | 145 ++++++++++++++++++++++++------
 kernel/bpf/bpf_task_storage.c     |  35 ++++++--
 kernel/bpf/syscall.c              |   2 +-
 4 files changed, 163 insertions(+), 33 deletions(-)
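For context on how the series is meant to be consumed from the BPF side, a minimal sketch of a task_local storage map declared with the new flag; the map name, value layout, and attach point are illustrative, and BPF_F_MMAPABLE on task storage is only accepted with this series applied:

#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

struct mapval {
	__u64 counters[4];
};

struct {
	__uint(type, BPF_MAP_TYPE_TASK_STORAGE);
	__uint(map_flags, BPF_F_NO_PREALLOC | BPF_F_MMAPABLE);
	__type(key, int);
	__type(value, struct mapval);
} task_stats SEC(".maps");

SEC("tp_btf/sched_process_fork")
int BPF_PROG(on_fork, struct task_struct *parent, struct task_struct *child)
{
	struct mapval *v;

	v = bpf_task_storage_get(&task_stats, child, NULL,
				 BPF_LOCAL_STORAGE_GET_F_CREATE);
	if (v)
		v->counters[0]++;	/* visible to the task via mmap() of the map fd */
	return 0;
}

char _license[] SEC("license") = "GPL";

The child task would then mmap() the pinned map fd as in the earlier userspace sketch to read (or update) its own value without going through the bpf() syscall.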