mbox series

[v7,bpf-next,0/5] Open-coded task_vma iter

Message ID 20231013204426.1074286-1-davemarchevsky@fb.com (mailing list archive)
Headers show
Series Open-coded task_vma iter | expand

Message

Dave Marchevsky Oct. 13, 2023, 8:44 p.m. UTC
At Meta we have a profiling daemon which periodically collects
information on many hosts. This collection usually involves grabbing
stacks (user and kernel) using perf_event BPF progs and later symbolicating
them. For user stacks we try to use BPF_F_USER_BUILD_ID and rely on
remote symbolication, but BPF_F_USER_BUILD_ID doesn't always succeed. In
those cases we must fall back to digging around in /proc/PID/maps to map
virtual address to (binary, offset). The /proc/PID/maps digging does not
occur synchronously with stack collection, so the process might already
be gone, in which case it won't have /proc/PID/maps and we will fail to
symbolicate.

This 'exited process problem' doesn't occur very often as
most of the prod services we care to profile are long-lived daemons, but
there are enough usecases to warrant a workaround: a BPF program which
can be optionally loaded at data collection time and essentially walks
/proc/PID/maps. Currently this is done by walking the vma list:

  struct vm_area_struct* mmap = BPF_CORE_READ(mm, mmap);
  mmap_next = BPF_CORE_READ(rmap, vm_next); /* in a loop */

Since commit 763ecb035029 ("mm: remove the vma linked list") there's no
longer a vma linked list to walk. Walking the vma maple tree is not as
simple as hopping struct vm_area_struct->vm_next. Luckily,
commit f39af05949a4 ("mm: add VMA iterator"), another commit in that series,
added struct vma_iterator and for_each_vma macro for easy vma iteration. If
similar functionality was exposed to BPF programs, it would be perfect for our
usecase.

This series adds such functionality, specifically a BPF equivalent of
for_each_vma using the open-coded iterator style.

Notes:
  * This approach was chosen after discussion on a previous series [0] which
    attempted to solve the same problem by adding a BPF_F_VMA_NEXT flag to
    bpf_find_vma.
  * Unlike the task_vma bpf_iter, the open-coded iterator kfuncs here do not
    drop the vma read lock between iterations. See Alexei's response in [0].
  * The [vsyscall] page isn't really part of task->mm's vmas, but
    /proc/PID/maps returns information about it anyways. The vma iter added
    here does not do the same. See comment on selftest in patch 3.
  * bpf_iter_task_vma allocates a _data struct which contains - among other
    things - struct vma_iterator, using BPF allocator and keeps a pointer to
    the bpf_iter_task_vma_data. This is done in order to prevent changes to
    struct ma_state - which is wrapped by struct vma_iterator - from
    necessitating changes to uapi struct bpf_iter_task_vma.

Changelog:

v6 -> v7: https://lore.kernel.org/bpf/20231010185944.3888849-1-davemarchevsky@fb.com/

Patch numbers correspond to their position in v6

Patch 2 ("selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.c")
  * Add Andrii ack
Patch 3 ("bpf: Introduce task_vma open-coded iterator kfuncs")
  * Add Andrii ack
  * Add missing __diag_ignore_all for -Wmissing-prototypes (Song)
Patch 4 ("selftests/bpf: Add tests for open-coded task_vma iter")
  * Remove two unnecessary header includes (Andrii)
  * Remove extraneous !vmas_seen check (Andrii)
New Patch ("bpf: Add BPF_KFUNC_{START,END}_defs macros")
  * After talking to Andrii, this is an attempt to clean up __diag_ignore_all
    spam everywhere kfuncs are defined. If nontrivial changes are needed,
    let's apply the other 4 and I'll respin as a standalone patch.

v5 -> v6: https://lore.kernel.org/bpf/20231010175637.3405682-1-davemarchevsky@fb.com/

Patch 4 ("selftests/bpf: Add tests for open-coded task_vma iter")
  * Remove extraneous blank line. I did this manually to the .patch file
    for v5, which caused BPF CI to complain about failing to apply the
    series

v4 -> v5: https://lore.kernel.org/bpf/20231002195341.2940874-1-davemarchevsky@fb.com/

Patch numbers correspond to their position in v4

New Patch ("selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.c")
  * Patch 2's renaming of this selftest, and associated changes in the
    userspace runner, are split out into this separate commit (Andrii)

Patch 2 ("bpf: Introduce task_vma open-coded iterator kfuncs")
  * Remove bpf_iter_task_vma kfuncs from libbpf's bpf_helpers.h, they'll be
    added to selftests' bpf_experimental.h in selftests patch below (Andrii)
  * Split bpf_iter_task_vma.c renaming into separate commit (Andrii)

Patch 3 ("selftests/bpf: Add tests for open-coded task_vma iter")
  * Add bpf_iter_task_vma kfuncs to bpf_experimental.h (Andrii)
  * Remove '?' from prog SEC, open_and_load the skel in one operation (Andrii)
  * Ensure that fclose() always happens in test runner (Andrii)
  * Use global var w/ 1000 (vm_start, vm_end) structs instead of two
    MAP_TYPE_ARRAY's w/ 1k u64s each (Andrii)


v3 -> v4: https://lore.kernel.org/bpf/20230822050558.2937659-1-davemarchevsky@fb.com/

Patch 1 ("bpf: Don't explicitly emit BTF for struct btf_iter_num")
  * Add Andrii ack
Patch 2 ("bpf: Introduce task_vma open-coded iterator kfuncs")
  * Mark bpf_iter_task_vma_new args KF_RCU and remove now-unnecessary !task
    check (Yonghong)
    * Although KF_RCU is a function-level flag, in reality it only applies to
      the task_struct *task parameter, as the other two params are a scalar int
      and a specially-handled KF_ARG_PTR_TO_ITER
   * Remove struct bpf_iter_task_vma definition from uapi headers, define in
     kernel/bpf/task_iter.c instead (Andrii)
Patch 3 ("selftests/bpf: Add tests for open-coded task_vma iter")
  * Use a local var when looping over vmas to track map idx. Update vmas_seen
    global after done iterating. Don't start iterating or update vmas_seen if
    vmas_seen global is nonzero. (Andrii)
  * Move getpgid() call to correct spot - above skel detach. (Andrii)

v2 -> v3: https://lore.kernel.org/bpf/20230821173415.1970776-1-davemarchevsky@fb.com/

Patch 1 ("bpf: Don't explicitly emit BTF for struct btf_iter_num")
  * Add Yonghong ack

Patch 2 ("bpf: Introduce task_vma open-coded iterator kfuncs")
  * UAPI bpf header and tools/ version should match
  * Add bpf_iter_task_vma_kern_data which bpf_iter_task_vma_kern points to,
    bpf_mem_alloc/free it instead of just vma_iterator. (Alexei)
    * Inner data ptr == NULL implies initialization failed


v1 -> v2: https://lore.kernel.org/bpf/20230810183513.684836-1-davemarchevsky@fb.com/
  * Patch 1
    * Now removes the unnecessary BTF_TYPE_EMIT instead of changing the
      type (Yonghong)
  * Patch 2
    * Don't do unnecessary BTF_TYPE_EMIT (Yonghong)
    * Bump task refcount to prevent ->mm reuse (Yonghong)
    * Keep a pointer to vma_iterator in bpf_iter_task_vma, alloc/free
      via BPF mem allocator (Yonghong, Stanislav)
  * Patch 3

  [0]: https://lore.kernel.org/bpf/20230801145414.418145-1-davemarchevsky@fb.com/

Dave Marchevsky (5):
  bpf: Don't explicitly emit BTF for struct btf_iter_num
  selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.c
  bpf: Introduce task_vma open-coded iterator kfuncs
  selftests/bpf: Add tests for open-coded task_vma iter
  bpf: Add BPF_KFUNC_{START,END}_defs macros

 include/linux/btf.h                           |  7 ++
 kernel/bpf/bpf_iter.c                         |  8 +-
 kernel/bpf/cpumask.c                          |  6 +-
 kernel/bpf/helpers.c                          |  9 +-
 kernel/bpf/map_iter.c                         |  6 +-
 kernel/bpf/task_iter.c                        | 89 +++++++++++++++++++
 kernel/trace/bpf_trace.c                      |  6 +-
 net/bpf/test_run.c                            |  7 +-
 net/core/filter.c                             | 13 +--
 net/core/xdp.c                                |  6 +-
 net/ipv4/fou_bpf.c                            |  6 +-
 net/netfilter/nf_conntrack_bpf.c              |  6 +-
 net/netfilter/nf_nat_bpf.c                    |  6 +-
 net/xfrm/xfrm_interface_bpf.c                 |  6 +-
 .../testing/selftests/bpf/bpf_experimental.h  |  8 ++
 .../selftests/bpf/prog_tests/bpf_iter.c       | 26 +++---
 .../testing/selftests/bpf/prog_tests/iters.c  | 58 ++++++++++++
 ...f_iter_task_vma.c => bpf_iter_task_vmas.c} |  0
 .../selftests/bpf/progs/iters_task_vma.c      | 43 +++++++++
 19 files changed, 248 insertions(+), 68 deletions(-)
 rename tools/testing/selftests/bpf/progs/{bpf_iter_task_vma.c => bpf_iter_task_vmas.c} (100%)
 create mode 100644 tools/testing/selftests/bpf/progs/iters_task_vma.c

Comments

David Marchevsky Oct. 13, 2023, 8:48 p.m. UTC | #1
On 10/13/23 4:44 PM, Dave Marchevsky wrote:
> At Meta we have a profiling daemon which periodically collects
> information on many hosts. This collection usually involves grabbing
> stacks (user and kernel) using perf_event BPF progs and later symbolicating
> them. For user stacks we try to use BPF_F_USER_BUILD_ID and rely on
> remote symbolication, but BPF_F_USER_BUILD_ID doesn't always succeed. In
> those cases we must fall back to digging around in /proc/PID/maps to map
> virtual address to (binary, offset). The /proc/PID/maps digging does not
> occur synchronously with stack collection, so the process might already
> be gone, in which case it won't have /proc/PID/maps and we will fail to
> symbolicate.
> 
> This 'exited process problem' doesn't occur very often as
> most of the prod services we care to profile are long-lived daemons, but
> there are enough usecases to warrant a workaround: a BPF program which
> can be optionally loaded at data collection time and essentially walks
> /proc/PID/maps. Currently this is done by walking the vma list:
> 
>   struct vm_area_struct* mmap = BPF_CORE_READ(mm, mmap);
>   mmap_next = BPF_CORE_READ(rmap, vm_next); /* in a loop */
> 
> Since commit 763ecb035029 ("mm: remove the vma linked list") there's no
> longer a vma linked list to walk. Walking the vma maple tree is not as
> simple as hopping struct vm_area_struct->vm_next. Luckily,
> commit f39af05949a4 ("mm: add VMA iterator"), another commit in that series,
> added struct vma_iterator and for_each_vma macro for easy vma iteration. If
> similar functionality was exposed to BPF programs, it would be perfect for our
> usecase.
> 
> This series adds such functionality, specifically a BPF equivalent of
> for_each_vma using the open-coded iterator style.
> 
> Notes:
>   * This approach was chosen after discussion on a previous series [0] which
>     attempted to solve the same problem by adding a BPF_F_VMA_NEXT flag to
>     bpf_find_vma.
>   * Unlike the task_vma bpf_iter, the open-coded iterator kfuncs here do not
>     drop the vma read lock between iterations. See Alexei's response in [0].
>   * The [vsyscall] page isn't really part of task->mm's vmas, but
>     /proc/PID/maps returns information about it anyways. The vma iter added
>     here does not do the same. See comment on selftest in patch 3.
>   * bpf_iter_task_vma allocates a _data struct which contains - among other
>     things - struct vma_iterator, using BPF allocator and keeps a pointer to
>     the bpf_iter_task_vma_data. This is done in order to prevent changes to
>     struct ma_state - which is wrapped by struct vma_iterator - from
>     necessitating changes to uapi struct bpf_iter_task_vma.
> 
> Changelog:
> 
> v6 -> v7: https://lore.kernel.org/bpf/20231010185944.3888849-1-davemarchevsky@fb.com/
> 
> Patch numbers correspond to their position in v6
> 
> Patch 2 ("selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.c")
>   * Add Andrii ack
> Patch 3 ("bpf: Introduce task_vma open-coded iterator kfuncs")
>   * Add Andrii ack
>   * Add missing __diag_ignore_all for -Wmissing-prototypes (Song)
> Patch 4 ("selftests/bpf: Add tests for open-coded task_vma iter")
>   * Remove two unnecessary header includes (Andrii)
>   * Remove extraneous !vmas_seen check (Andrii)

Whoops, forgot to add another bullet here:
  * Initialize FILE *f = NULL to address llvm-16 CI warnings (Andrii)

> New Patch ("bpf: Add BPF_KFUNC_{START,END}_defs macros")
>   * After talking to Andrii, this is an attempt to clean up __diag_ignore_all
>     spam everywhere kfuncs are defined. If nontrivial changes are needed,
>     let's apply the other 4 and I'll respin as a standalone patch.
> 
> v5 -> v6: https://lore.kernel.org/bpf/20231010175637.3405682-1-davemarchevsky@fb.com/
> 
> Patch 4 ("selftests/bpf: Add tests for open-coded task_vma iter")
>   * Remove extraneous blank line. I did this manually to the .patch file
>     for v5, which caused BPF CI to complain about failing to apply the
>     series
> 
> v4 -> v5: https://lore.kernel.org/bpf/20231002195341.2940874-1-davemarchevsky@fb.com/
> 
> Patch numbers correspond to their position in v4
> 
> New Patch ("selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.c")
>   * Patch 2's renaming of this selftest, and associated changes in the
>     userspace runner, are split out into this separate commit (Andrii)
> 
> Patch 2 ("bpf: Introduce task_vma open-coded iterator kfuncs")
>   * Remove bpf_iter_task_vma kfuncs from libbpf's bpf_helpers.h, they'll be
>     added to selftests' bpf_experimental.h in selftests patch below (Andrii)
>   * Split bpf_iter_task_vma.c renaming into separate commit (Andrii)
> 
> Patch 3 ("selftests/bpf: Add tests for open-coded task_vma iter")
>   * Add bpf_iter_task_vma kfuncs to bpf_experimental.h (Andrii)
>   * Remove '?' from prog SEC, open_and_load the skel in one operation (Andrii)
>   * Ensure that fclose() always happens in test runner (Andrii)
>   * Use global var w/ 1000 (vm_start, vm_end) structs instead of two
>     MAP_TYPE_ARRAY's w/ 1k u64s each (Andrii)
> 
> 
> v3 -> v4: https://lore.kernel.org/bpf/20230822050558.2937659-1-davemarchevsky@fb.com/
> 
> Patch 1 ("bpf: Don't explicitly emit BTF for struct btf_iter_num")
>   * Add Andrii ack
> Patch 2 ("bpf: Introduce task_vma open-coded iterator kfuncs")
>   * Mark bpf_iter_task_vma_new args KF_RCU and remove now-unnecessary !task
>     check (Yonghong)
>     * Although KF_RCU is a function-level flag, in reality it only applies to
>       the task_struct *task parameter, as the other two params are a scalar int
>       and a specially-handled KF_ARG_PTR_TO_ITER
>    * Remove struct bpf_iter_task_vma definition from uapi headers, define in
>      kernel/bpf/task_iter.c instead (Andrii)
> Patch 3 ("selftests/bpf: Add tests for open-coded task_vma iter")
>   * Use a local var when looping over vmas to track map idx. Update vmas_seen
>     global after done iterating. Don't start iterating or update vmas_seen if
>     vmas_seen global is nonzero. (Andrii)
>   * Move getpgid() call to correct spot - above skel detach. (Andrii)
> 
> v2 -> v3: https://lore.kernel.org/bpf/20230821173415.1970776-1-davemarchevsky@fb.com/
> 
> Patch 1 ("bpf: Don't explicitly emit BTF for struct btf_iter_num")
>   * Add Yonghong ack
> 
> Patch 2 ("bpf: Introduce task_vma open-coded iterator kfuncs")
>   * UAPI bpf header and tools/ version should match
>   * Add bpf_iter_task_vma_kern_data which bpf_iter_task_vma_kern points to,
>     bpf_mem_alloc/free it instead of just vma_iterator. (Alexei)
>     * Inner data ptr == NULL implies initialization failed
> 
> 
> v1 -> v2: https://lore.kernel.org/bpf/20230810183513.684836-1-davemarchevsky@fb.com/
>   * Patch 1
>     * Now removes the unnecessary BTF_TYPE_EMIT instead of changing the
>       type (Yonghong)
>   * Patch 2
>     * Don't do unnecessary BTF_TYPE_EMIT (Yonghong)
>     * Bump task refcount to prevent ->mm reuse (Yonghong)
>     * Keep a pointer to vma_iterator in bpf_iter_task_vma, alloc/free
>       via BPF mem allocator (Yonghong, Stanislav)
>   * Patch 3
> 
>   [0]: https://lore.kernel.org/bpf/20230801145414.418145-1-davemarchevsky@fb.com/
> 
> Dave Marchevsky (5):
>   bpf: Don't explicitly emit BTF for struct btf_iter_num
>   selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.c
>   bpf: Introduce task_vma open-coded iterator kfuncs
>   selftests/bpf: Add tests for open-coded task_vma iter
>   bpf: Add BPF_KFUNC_{START,END}_defs macros
> 
>  include/linux/btf.h                           |  7 ++
>  kernel/bpf/bpf_iter.c                         |  8 +-
>  kernel/bpf/cpumask.c                          |  6 +-
>  kernel/bpf/helpers.c                          |  9 +-
>  kernel/bpf/map_iter.c                         |  6 +-
>  kernel/bpf/task_iter.c                        | 89 +++++++++++++++++++
>  kernel/trace/bpf_trace.c                      |  6 +-
>  net/bpf/test_run.c                            |  7 +-
>  net/core/filter.c                             | 13 +--
>  net/core/xdp.c                                |  6 +-
>  net/ipv4/fou_bpf.c                            |  6 +-
>  net/netfilter/nf_conntrack_bpf.c              |  6 +-
>  net/netfilter/nf_nat_bpf.c                    |  6 +-
>  net/xfrm/xfrm_interface_bpf.c                 |  6 +-
>  .../testing/selftests/bpf/bpf_experimental.h  |  8 ++
>  .../selftests/bpf/prog_tests/bpf_iter.c       | 26 +++---
>  .../testing/selftests/bpf/prog_tests/iters.c  | 58 ++++++++++++
>  ...f_iter_task_vma.c => bpf_iter_task_vmas.c} |  0
>  .../selftests/bpf/progs/iters_task_vma.c      | 43 +++++++++
>  19 files changed, 248 insertions(+), 68 deletions(-)
>  rename tools/testing/selftests/bpf/progs/{bpf_iter_task_vma.c => bpf_iter_task_vmas.c} (100%)
>  create mode 100644 tools/testing/selftests/bpf/progs/iters_task_vma.c
>
Andrii Nakryiko Oct. 13, 2023, 10:56 p.m. UTC | #2
On Fri, Oct 13, 2023 at 1:48 PM David Marchevsky
<david.marchevsky@linux.dev> wrote:
>
> On 10/13/23 4:44 PM, Dave Marchevsky wrote:
> > At Meta we have a profiling daemon which periodically collects
> > information on many hosts. This collection usually involves grabbing
> > stacks (user and kernel) using perf_event BPF progs and later symbolicating
> > them. For user stacks we try to use BPF_F_USER_BUILD_ID and rely on
> > remote symbolication, but BPF_F_USER_BUILD_ID doesn't always succeed. In
> > those cases we must fall back to digging around in /proc/PID/maps to map
> > virtual address to (binary, offset). The /proc/PID/maps digging does not
> > occur synchronously with stack collection, so the process might already
> > be gone, in which case it won't have /proc/PID/maps and we will fail to
> > symbolicate.
> >
> > This 'exited process problem' doesn't occur very often as
> > most of the prod services we care to profile are long-lived daemons, but
> > there are enough usecases to warrant a workaround: a BPF program which
> > can be optionally loaded at data collection time and essentially walks
> > /proc/PID/maps. Currently this is done by walking the vma list:
> >
> >   struct vm_area_struct* mmap = BPF_CORE_READ(mm, mmap);
> >   mmap_next = BPF_CORE_READ(rmap, vm_next); /* in a loop */
> >
> > Since commit 763ecb035029 ("mm: remove the vma linked list") there's no
> > longer a vma linked list to walk. Walking the vma maple tree is not as
> > simple as hopping struct vm_area_struct->vm_next. Luckily,
> > commit f39af05949a4 ("mm: add VMA iterator"), another commit in that series,
> > added struct vma_iterator and for_each_vma macro for easy vma iteration. If
> > similar functionality was exposed to BPF programs, it would be perfect for our
> > usecase.
> >
> > This series adds such functionality, specifically a BPF equivalent of
> > for_each_vma using the open-coded iterator style.
> >
> > Notes:
> >   * This approach was chosen after discussion on a previous series [0] which
> >     attempted to solve the same problem by adding a BPF_F_VMA_NEXT flag to
> >     bpf_find_vma.
> >   * Unlike the task_vma bpf_iter, the open-coded iterator kfuncs here do not
> >     drop the vma read lock between iterations. See Alexei's response in [0].
> >   * The [vsyscall] page isn't really part of task->mm's vmas, but
> >     /proc/PID/maps returns information about it anyways. The vma iter added
> >     here does not do the same. See comment on selftest in patch 3.
> >   * bpf_iter_task_vma allocates a _data struct which contains - among other
> >     things - struct vma_iterator, using BPF allocator and keeps a pointer to
> >     the bpf_iter_task_vma_data. This is done in order to prevent changes to
> >     struct ma_state - which is wrapped by struct vma_iterator - from
> >     necessitating changes to uapi struct bpf_iter_task_vma.
> >
> > Changelog:
> >
> > v6 -> v7: https://lore.kernel.org/bpf/20231010185944.3888849-1-davemarchevsky@fb.com/
> >
> > Patch numbers correspond to their position in v6
> >
> > Patch 2 ("selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.c")
> >   * Add Andrii ack
> > Patch 3 ("bpf: Introduce task_vma open-coded iterator kfuncs")
> >   * Add Andrii ack
> >   * Add missing __diag_ignore_all for -Wmissing-prototypes (Song)
> > Patch 4 ("selftests/bpf: Add tests for open-coded task_vma iter")
> >   * Remove two unnecessary header includes (Andrii)
> >   * Remove extraneous !vmas_seen check (Andrii)
>
> Whoops, forgot to add another bullet here:
>   * Initialize FILE *f = NULL to address llvm-16 CI warnings (Andrii)
>
> > New Patch ("bpf: Add BPF_KFUNC_{START,END}_defs macros")
> >   * After talking to Andrii, this is an attempt to clean up __diag_ignore_all
> >     spam everywhere kfuncs are defined. If nontrivial changes are needed,
> >     let's apply the other 4 and I'll respin as a standalone patch.
> >
> > v5 -> v6: https://lore.kernel.org/bpf/20231010175637.3405682-1-davemarchevsky@fb.com/
> >
> > Patch 4 ("selftests/bpf: Add tests for open-coded task_vma iter")
> >   * Remove extraneous blank line. I did this manually to the .patch file
> >     for v5, which caused BPF CI to complain about failing to apply the
> >     series
> >
> > v4 -> v5: https://lore.kernel.org/bpf/20231002195341.2940874-1-davemarchevsky@fb.com/
> >
> > Patch numbers correspond to their position in v4
> >
> > New Patch ("selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.c")
> >   * Patch 2's renaming of this selftest, and associated changes in the
> >     userspace runner, are split out into this separate commit (Andrii)
> >
> > Patch 2 ("bpf: Introduce task_vma open-coded iterator kfuncs")
> >   * Remove bpf_iter_task_vma kfuncs from libbpf's bpf_helpers.h, they'll be
> >     added to selftests' bpf_experimental.h in selftests patch below (Andrii)
> >   * Split bpf_iter_task_vma.c renaming into separate commit (Andrii)
> >
> > Patch 3 ("selftests/bpf: Add tests for open-coded task_vma iter")
> >   * Add bpf_iter_task_vma kfuncs to bpf_experimental.h (Andrii)
> >   * Remove '?' from prog SEC, open_and_load the skel in one operation (Andrii)
> >   * Ensure that fclose() always happens in test runner (Andrii)
> >   * Use global var w/ 1000 (vm_start, vm_end) structs instead of two
> >     MAP_TYPE_ARRAY's w/ 1k u64s each (Andrii)
> >
> >
> > v3 -> v4: https://lore.kernel.org/bpf/20230822050558.2937659-1-davemarchevsky@fb.com/
> >
> > Patch 1 ("bpf: Don't explicitly emit BTF for struct btf_iter_num")
> >   * Add Andrii ack
> > Patch 2 ("bpf: Introduce task_vma open-coded iterator kfuncs")
> >   * Mark bpf_iter_task_vma_new args KF_RCU and remove now-unnecessary !task
> >     check (Yonghong)
> >     * Although KF_RCU is a function-level flag, in reality it only applies to
> >       the task_struct *task parameter, as the other two params are a scalar int
> >       and a specially-handled KF_ARG_PTR_TO_ITER
> >    * Remove struct bpf_iter_task_vma definition from uapi headers, define in
> >      kernel/bpf/task_iter.c instead (Andrii)
> > Patch 3 ("selftests/bpf: Add tests for open-coded task_vma iter")
> >   * Use a local var when looping over vmas to track map idx. Update vmas_seen
> >     global after done iterating. Don't start iterating or update vmas_seen if
> >     vmas_seen global is nonzero. (Andrii)
> >   * Move getpgid() call to correct spot - above skel detach. (Andrii)
> >
> > v2 -> v3: https://lore.kernel.org/bpf/20230821173415.1970776-1-davemarchevsky@fb.com/
> >
> > Patch 1 ("bpf: Don't explicitly emit BTF for struct btf_iter_num")
> >   * Add Yonghong ack
> >
> > Patch 2 ("bpf: Introduce task_vma open-coded iterator kfuncs")
> >   * UAPI bpf header and tools/ version should match
> >   * Add bpf_iter_task_vma_kern_data which bpf_iter_task_vma_kern points to,
> >     bpf_mem_alloc/free it instead of just vma_iterator. (Alexei)
> >     * Inner data ptr == NULL implies initialization failed
> >
> >
> > v1 -> v2: https://lore.kernel.org/bpf/20230810183513.684836-1-davemarchevsky@fb.com/
> >   * Patch 1
> >     * Now removes the unnecessary BTF_TYPE_EMIT instead of changing the
> >       type (Yonghong)
> >   * Patch 2
> >     * Don't do unnecessary BTF_TYPE_EMIT (Yonghong)
> >     * Bump task refcount to prevent ->mm reuse (Yonghong)
> >     * Keep a pointer to vma_iterator in bpf_iter_task_vma, alloc/free
> >       via BPF mem allocator (Yonghong, Stanislav)
> >   * Patch 3
> >
> >   [0]: https://lore.kernel.org/bpf/20230801145414.418145-1-davemarchevsky@fb.com/
> >
> > Dave Marchevsky (5):
> >   bpf: Don't explicitly emit BTF for struct btf_iter_num
> >   selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.c
> >   bpf: Introduce task_vma open-coded iterator kfuncs
> >   selftests/bpf: Add tests for open-coded task_vma iter
> >   bpf: Add BPF_KFUNC_{START,END}_defs macros

I dropped patch #5 from the series and applied the rest, thanks.

For BPF_KFUNC_{START,END}_DEFS, I like the clean up, but I feel like
we might want to follow similar function-call like syntax as with
__diag_push() and make sure we use semicolon to terminate their
"invocations". If you agree, please adjust and submit separately, I'd
be curious to see what others think.

Also, keep in mind that __diag_ignore_all("-Wmissing-prototypes") that
we use right now might not be enough. We started getting more and more
reports with different warnings recently, so we probably would need to
ignore more warnings for kfuncs:

kernel/bpf/helpers.c:1906:19: warning: no previous declaration for
'bpf_percpu_obj_new_impl' [-Wmissing-declarations]
kernel/bpf/helpers.c:1942:18: warning: no previous declaration for
'bpf_percpu_obj_drop_impl' [-Wmissing-declarations]
kernel/bpf/helpers.c:2477:18: warning: no previous declaration for
'bpf_throw' [-Wmissing-declarations]

Which just means that this refactoring is even more important and timely :)


> >
> >  include/linux/btf.h                           |  7 ++
> >  kernel/bpf/bpf_iter.c                         |  8 +-
> >  kernel/bpf/cpumask.c                          |  6 +-
> >  kernel/bpf/helpers.c                          |  9 +-
> >  kernel/bpf/map_iter.c                         |  6 +-
> >  kernel/bpf/task_iter.c                        | 89 +++++++++++++++++++
> >  kernel/trace/bpf_trace.c                      |  6 +-
> >  net/bpf/test_run.c                            |  7 +-
> >  net/core/filter.c                             | 13 +--
> >  net/core/xdp.c                                |  6 +-
> >  net/ipv4/fou_bpf.c                            |  6 +-
> >  net/netfilter/nf_conntrack_bpf.c              |  6 +-
> >  net/netfilter/nf_nat_bpf.c                    |  6 +-
> >  net/xfrm/xfrm_interface_bpf.c                 |  6 +-
> >  .../testing/selftests/bpf/bpf_experimental.h  |  8 ++
> >  .../selftests/bpf/prog_tests/bpf_iter.c       | 26 +++---
> >  .../testing/selftests/bpf/prog_tests/iters.c  | 58 ++++++++++++
> >  ...f_iter_task_vma.c => bpf_iter_task_vmas.c} |  0
> >  .../selftests/bpf/progs/iters_task_vma.c      | 43 +++++++++
> >  19 files changed, 248 insertions(+), 68 deletions(-)
> >  rename tools/testing/selftests/bpf/progs/{bpf_iter_task_vma.c => bpf_iter_task_vmas.c} (100%)
> >  create mode 100644 tools/testing/selftests/bpf/progs/iters_task_vma.c
> >
patchwork-bot+netdevbpf@kernel.org Oct. 13, 2023, 11 p.m. UTC | #3
Hello:

This series was applied to bpf/bpf-next.git (master)
by Andrii Nakryiko <andrii@kernel.org>:

On Fri, 13 Oct 2023 13:44:21 -0700 you wrote:
> At Meta we have a profiling daemon which periodically collects
> information on many hosts. This collection usually involves grabbing
> stacks (user and kernel) using perf_event BPF progs and later symbolicating
> them. For user stacks we try to use BPF_F_USER_BUILD_ID and rely on
> remote symbolication, but BPF_F_USER_BUILD_ID doesn't always succeed. In
> those cases we must fall back to digging around in /proc/PID/maps to map
> virtual address to (binary, offset). The /proc/PID/maps digging does not
> occur synchronously with stack collection, so the process might already
> be gone, in which case it won't have /proc/PID/maps and we will fail to
> symbolicate.
> 
> [...]

Here is the summary with links:
  - [v7,bpf-next,1/5] bpf: Don't explicitly emit BTF for struct btf_iter_num
    https://git.kernel.org/bpf/bpf-next/c/f10ca5da5bd7
  - [v7,bpf-next,2/5] selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.c
    https://git.kernel.org/bpf/bpf-next/c/45b38941c81f
  - [v7,bpf-next,3/5] bpf: Introduce task_vma open-coded iterator kfuncs
    https://git.kernel.org/bpf/bpf-next/c/4ac454682158
  - [v7,bpf-next,4/5] selftests/bpf: Add tests for open-coded task_vma iter
    https://git.kernel.org/bpf/bpf-next/c/e0e1a7a5fc37
  - [v7,bpf-next,5/5] bpf: Add BPF_KFUNC_{START,END}_defs macros
    (no matching commit)

You are awesome, thank you!