[v7,bpf-next,0/3] introduce bpf_iter for task_vma

Message ID 20210212183107.50963-1-songliubraving@fb.com (mailing list archive)

Message

Song Liu Feb. 12, 2021, 6:31 p.m. UTC
This set introduces bpf_iter for task_vma, which can be used to generate
information similar to /proc/pid/maps. Patch 3/3 adds an example that
mimics /proc/pid/maps.

Currently, /proc/<pid>/maps and /proc/<pid>/smaps provide information about
the vmas of a process. However, this information is not flexible enough to
cover all use cases. For example, if a vma covers a mix of 2MB pages and 4kB
pages (x86_64), there is no easy way to tell which address ranges are
backed by 2MB pages. task_vma solves the problem by enabling the user to
generate customized information based on the vma (and vma->vm_mm,
vma->vm_file, etc.).
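
As a rough illustration of the kind of per-vma output meant here, the
sketch below formats the address range and permission columns of a
/proc/pid/maps line from a vma-like record. It is plain user-space C, not
the BPF program from this series: struct mock_vma and format_vma() are
hypothetical names made up for the example, and only the flag values are
taken from include/linux/mm.h.

```c
#include <assert.h>
#include <stdio.h>

/* Flag values as defined in the kernel's include/linux/mm.h. */
#define VM_READ     0x00000001UL
#define VM_WRITE    0x00000002UL
#define VM_EXEC     0x00000004UL
#define VM_MAYSHARE 0x00000080UL

/* Hypothetical stand-in for the few vm_area_struct fields used below. */
struct mock_vma {
	unsigned long vm_start;
	unsigned long vm_end;
	unsigned long vm_flags;
};

/*
 * Write "start-end perms" into buf, in the style of the first two
 * columns of /proc/pid/maps. Returns what snprintf() returns.
 */
static int format_vma(const struct mock_vma *vma, char *buf, size_t len)
{
	char perm[] = "----";

	assert(vma != NULL && buf != NULL && len > 0);

	perm[0] = (vma->vm_flags & VM_READ) ? 'r' : '-';
	perm[1] = (vma->vm_flags & VM_WRITE) ? 'w' : '-';
	perm[2] = (vma->vm_flags & VM_EXEC) ? 'x' : '-';
	perm[3] = (vma->vm_flags & VM_MAYSHARE) ? 's' : 'p';

	return snprintf(buf, len, "%08lx-%08lx %s",
			vma->vm_start, vma->vm_end, perm);
}
```

A BPF iterator program would have the real vma available instead of the
mock, so it could add fields that /proc/pid/maps does not print.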

Changes v6 => v7:
  1. Let BPF iter program use bpf_d_path without specifying sleepable.
     (Alexei)

Changes v5 => v6:
  1. Add more comments for task_vma_seq_get_next() to explain the logic
     of find_vma() calls. (Alexei)
  2. Skip the vma found by find_vma() only when both vm_start and vm_end
     match prev_vm_[start|end]. Previous versions only compared vm_start.
     IOW, if the vma [4k, 8k] is replaced by [4k, 12k] after relocking
     mmap_lock, v5 will skip the new vma, while v6 will process it.
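
The difference between the v5 and v6 comparison can be modeled in a few
lines of user-space C. This is only a sketch of the rule as described in
item 2, not the code in task_vma_seq_get_next(); struct mock_vma,
skip_v5() and skip_v6() are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two vm_area_struct fields compared. */
struct mock_vma {
	unsigned long vm_start;
	unsigned long vm_end;
};

/* v5 rule as described: skip the found vma if vm_start alone matches. */
static bool skip_v5(const struct mock_vma *found,
		    unsigned long prev_vm_start)
{
	assert(found != NULL);
	return found->vm_start == prev_vm_start;
}

/* v6 rule as described: skip only if vm_start and vm_end both match. */
static bool skip_v6(const struct mock_vma *found,
		    unsigned long prev_vm_start,
		    unsigned long prev_vm_end)
{
	assert(found != NULL);
	return found->vm_start == prev_vm_start &&
	       found->vm_end == prev_vm_end;
}
```

With prev_vm_[start|end] recorded as [4k, 8k] and a new vma of [4k, 12k],
skip_v5() returns true and skip_v6() returns false, which matches the
behavior described in item 2.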

Changes v4 => v5:
  1. Fix a refcount leak on task_struct. (Yonghong)
  2. Fix the selftest. (Yonghong)

Changes v3 => v4:
  1. Avoid skipping vma by assigning invalid prev_vm_start in
     task_vma_seq_stop(). (Yonghong)
  2. Move the "again" label in task_vma_seq_get_next() to save a check.
     (Yonghong)

Changes v2 => v3:
  1. Rewrite 1/4 so that we hold mmap_lock while calling BPF program. This
     enables the BPF program to access the real vma with BTF. (Alexei)
  2. Fix the logic when control is returned to user space. (Yonghong)
  3. Revise commit log and cover letter. (Yonghong)

Changes v1 => v2:
  1. Small fixes in task_iter.c and the selftests. (Yonghong)

Song Liu (3):
  bpf: introduce task_vma bpf_iter
  bpf: allow bpf_d_path in bpf_iter program
  selftests/bpf: add test for bpf_iter_task_vma

 kernel/bpf/task_iter.c                        | 267 +++++++++++++++++-
 kernel/trace/bpf_trace.c                      |   4 +
 .../selftests/bpf/prog_tests/bpf_iter.c       | 118 +++++++-
 tools/testing/selftests/bpf/progs/bpf_iter.h  |   8 +
 .../selftests/bpf/progs/bpf_iter_task_vma.c   |  58 ++++
 5 files changed, 444 insertions(+), 11 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/bpf_iter_task_vma.c

--
2.24.1

Comments

Alexei Starovoitov Feb. 12, 2021, 9:04 p.m. UTC | #1
On Fri, Feb 12, 2021 at 10:31 AM Song Liu <songliubraving@fb.com> wrote:
>
> This set introduces bpf_iter for task_vma, which can be used to generate
> information similar to /proc/pid/maps. Patch 3/3 adds an example that
> mimics /proc/pid/maps.
>
> Currently, /proc/<pid>/maps and /proc/<pid>/smaps provide information about
> the vmas of a process. However, this information is not flexible enough to
> cover all use cases. For example, if a vma covers a mix of 2MB pages and 4kB
> pages (x86_64), there is no easy way to tell which address ranges are
> backed by 2MB pages. task_vma solves the problem by enabling the user to
> generate customized information based on the vma (and vma->vm_mm,
> vma->vm_file, etc.).
>
> Changes v6 => v7:
>   1. Let BPF iter program use bpf_d_path without specifying sleepable.
>      (Alexei)

Applied. Thanks!