From patchwork Thu Nov 4 21:31:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 12603809 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0A8DCC433F5 for ; Thu, 4 Nov 2021 21:31:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E1B956120F for ; Thu, 4 Nov 2021 21:31:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232133AbhKDVea (ORCPT ); Thu, 4 Nov 2021 17:34:30 -0400 Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:27246 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231857AbhKDVe3 (ORCPT ); Thu, 4 Nov 2021 17:34:29 -0400 Received: from pps.filterd (m0044010.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 1A4IQ9XJ000412 for ; Thu, 4 Nov 2021 14:31:50 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=x82OZ9ql/z9n3zpJ8RYpbInj7pFXqRKa6EKsWtWMdQw=; b=P1PF99kWrnMH0y7aMB71+iXs1sChQZHii2Ca+yECZqd9LuG4Wnm+11EJhLGNC76Z7q7T Pl/v3J9XQfndfi0ufywT0f4tJgV/MdU1mblWIcYsdE3JiYEt9rDpjfeZH6IY5OtkOIGh wlklkgWar+qHK/8xsqV8kEsqY4j4oqyuYiY= Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com with ESMTP id 3c46b67t5q-7 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 04 Nov 2021 14:31:50 -0700 Received: from intmgw001.38.frc1.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:83::5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.14; Thu, 4 Nov 2021 14:31:47 -0700 Received: by devbig006.ftw2.facebook.com (Postfix, from userid 4523) id 162611DEB6413; Thu, 4 Nov 2021 14:31:43 -0700 (PDT) From: Song Liu To: , CC: , , , , , Song Liu , kernel test robot Subject: [PATCH v4 bpf-next 1/2] bpf: introduce helper bpf_find_vma Date: Thu, 4 Nov 2021 14:31:37 -0700 Message-ID: <20211104213138.2779620-2-songliubraving@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211104213138.2779620-1-songliubraving@fb.com> References: <20211104213138.2779620-1-songliubraving@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-FB-Source: Intern X-Proofpoint-ORIG-GUID: 9piEfuR181OJSIhJ0UZUGI_H9H3WX4PJ X-Proofpoint-GUID: 9piEfuR181OJSIhJ0UZUGI_H9H3WX4PJ X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-11-04_07,2021-11-03_01,2020-04-07_01 X-Proofpoint-Spam-Details: rule=fb_default_notspam policy=fb_default score=0 clxscore=1015 spamscore=0 impostorscore=0 suspectscore=0 priorityscore=1501 mlxscore=0 malwarescore=0 adultscore=0 phishscore=0 bulkscore=0 mlxlogscore=999 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2111040082 X-FB-Internal: deliver Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net In some profiler use cases, it is necessary to map an address to the backing file, e.g., a shared library. bpf_find_vma helper provides a flexible way to achieve this. bpf_find_vma maps an address of a task to the vma (vm_area_struct) for this address, and feed the vma to an callback BPF function. The callback function is necessary here, as we need to ensure mmap_sem is unlocked. It is necessary to lock mmap_sem for find_vma. To lock and unlock mmap_sem safely when irqs are disable, we use the same mechanism as stackmap with build_id. Specifically, when irqs are disabled, the unlocked is postponed in an irq_work. Refactor stackmap.c so that the irq_work is shared among bpf_find_vma and stackmap helpers. Reported-by: kernel test robot Signed-off-by: Song Liu Acked-by: Yonghong Song Tested-by: Hengqi Chen --- include/linux/bpf.h | 1 + include/uapi/linux/bpf.h | 20 +++++++++ kernel/bpf/btf.c | 5 ++- kernel/bpf/mmap_unlock_work.h | 65 +++++++++++++++++++++++++++ kernel/bpf/stackmap.c | 80 +++------------------------------- kernel/bpf/task_iter.c | 76 +++++++++++++++++++++++++++++--- kernel/bpf/verifier.c | 34 +++++++++++++++ kernel/trace/bpf_trace.c | 2 + tools/include/uapi/linux/bpf.h | 20 +++++++++ 9 files changed, 222 insertions(+), 81 deletions(-) create mode 100644 kernel/bpf/mmap_unlock_work.h diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 2be6dfd68df99..df3410bff4b06 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -2157,6 +2157,7 @@ extern const struct bpf_func_proto bpf_btf_find_by_name_kind_proto; extern const struct bpf_func_proto bpf_sk_setsockopt_proto; extern const struct bpf_func_proto bpf_sk_getsockopt_proto; extern const struct bpf_func_proto bpf_kallsyms_lookup_name_proto; +extern const struct bpf_func_proto bpf_find_vma_proto; const struct bpf_func_proto *tracing_prog_func_proto( enum bpf_func_id func_id, const struct bpf_prog *prog); diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index ba5af15e25f5c..509eee5f0393d 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -4938,6 +4938,25 @@ union bpf_attr { * **-ENOENT** if symbol is not found. * * **-EPERM** if caller does not have permission to obtain kernel address. + * + * long bpf_find_vma(struct task_struct *task, u64 addr, void *callback_fn, void *callback_ctx, u64 flags) + * Description + * Find vma of *task* that contains *addr*, call *callback_fn* + * function with *task*, *vma*, and *callback_ctx*. + * The *callback_fn* should be a static function and + * the *callback_ctx* should be a pointer to the stack. + * The *flags* is used to control certain aspects of the helper. + * Currently, the *flags* must be 0. + * + * The expected callback signature is + * + * long (\*callback_fn)(struct task_struct \*task, struct vm_area_struct \*vma, void \*callback_ctx); + * + * Return + * 0 on success. + * **-ENOENT** if *task->mm* is NULL, or no vma contains *addr*. + * **-EBUSY** if failed to try lock mmap_lock. + * **-EINVAL** for invalid **flags**. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -5120,6 +5139,7 @@ union bpf_attr { FN(trace_vprintk), \ FN(skc_to_unix_sock), \ FN(kallsyms_lookup_name), \ + FN(find_vma), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index dbc3ad07e21b6..cdb0fba656006 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -6342,7 +6342,10 @@ const struct bpf_func_proto bpf_btf_find_by_name_kind_proto = { .arg4_type = ARG_ANYTHING, }; -BTF_ID_LIST_GLOBAL_SINGLE(btf_task_struct_ids, struct, task_struct) +BTF_ID_LIST_GLOBAL(btf_task_struct_ids) +BTF_ID(struct, task_struct) +BTF_ID(struct, file) +BTF_ID(struct, vm_area_struct) /* BTF ID set registration API for modules */ diff --git a/kernel/bpf/mmap_unlock_work.h b/kernel/bpf/mmap_unlock_work.h new file mode 100644 index 0000000000000..5d18d7d85bef9 --- /dev/null +++ b/kernel/bpf/mmap_unlock_work.h @@ -0,0 +1,65 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* Copyright (c) 2021 Facebook + */ + +#ifndef __MMAP_UNLOCK_WORK_H__ +#define __MMAP_UNLOCK_WORK_H__ +#include + +/* irq_work to run mmap_read_unlock() in irq_work */ +struct mmap_unlock_irq_work { + struct irq_work irq_work; + struct mm_struct *mm; +}; + +DECLARE_PER_CPU(struct mmap_unlock_irq_work, mmap_unlock_work); + +/* + * We cannot do mmap_read_unlock() when the irq is disabled, because of + * risk to deadlock with rq_lock. To look up vma when the irqs are + * disabled, we need to run mmap_read_unlock() in irq_work. We use a + * percpu variable to do the irq_work. If the irq_work is already used + * by another lookup, we fall over. + */ +static inline bool bpf_mmap_unlock_get_irq_work(struct mmap_unlock_irq_work **work_ptr) +{ + struct mmap_unlock_irq_work *work = NULL; + bool irq_work_busy = false; + + if (irqs_disabled()) { + if (!IS_ENABLED(CONFIG_PREEMPT_RT)) { + work = this_cpu_ptr(&mmap_unlock_work); + if (irq_work_is_busy(&work->irq_work)) { + /* cannot queue more up_read, fallback */ + irq_work_busy = true; + } + } else { + /* + * PREEMPT_RT does not allow to trylock mmap sem in + * interrupt disabled context. Force the fallback code. + */ + irq_work_busy = true; + } + } + + *work_ptr = work; + return irq_work_busy; +} + +static inline void bpf_mmap_unlock_mm(struct mmap_unlock_irq_work *work, struct mm_struct *mm) +{ + if (!work) { + mmap_read_unlock(mm); + } else { + work->mm = mm; + + /* The lock will be released once we're out of interrupt + * context. Tell lockdep that we've released it now so + * it doesn't complain that we forgot to release it. + */ + rwsem_release(&mm->mmap_lock.dep_map, _RET_IP_); + irq_work_queue(&work->irq_work); + } +} + +#endif /* __MMAP_UNLOCK_WORK_H__ */ diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c index 6e75bbee39f0b..1de0a1b03636e 100644 --- a/kernel/bpf/stackmap.c +++ b/kernel/bpf/stackmap.c @@ -7,10 +7,10 @@ #include #include #include -#include #include #include #include "percpu_freelist.h" +#include "mmap_unlock_work.h" #define STACK_CREATE_FLAG_MASK \ (BPF_F_NUMA_NODE | BPF_F_RDONLY | BPF_F_WRONLY | \ @@ -31,25 +31,6 @@ struct bpf_stack_map { struct stack_map_bucket *buckets[]; }; -/* irq_work to run up_read() for build_id lookup in nmi context */ -struct stack_map_irq_work { - struct irq_work irq_work; - struct mm_struct *mm; -}; - -static void do_up_read(struct irq_work *entry) -{ - struct stack_map_irq_work *work; - - if (WARN_ON_ONCE(IS_ENABLED(CONFIG_PREEMPT_RT))) - return; - - work = container_of(entry, struct stack_map_irq_work, irq_work); - mmap_read_unlock_non_owner(work->mm); -} - -static DEFINE_PER_CPU(struct stack_map_irq_work, up_read_work); - static inline bool stack_map_use_build_id(struct bpf_map *map) { return (map->map_flags & BPF_F_STACK_BUILD_ID); @@ -149,35 +130,13 @@ static void stack_map_get_build_id_offset(struct bpf_stack_build_id *id_offs, u64 *ips, u32 trace_nr, bool user) { int i; + struct mmap_unlock_irq_work *work = NULL; + bool irq_work_busy = bpf_mmap_unlock_get_irq_work(&work); struct vm_area_struct *vma; - bool irq_work_busy = false; - struct stack_map_irq_work *work = NULL; - - if (irqs_disabled()) { - if (!IS_ENABLED(CONFIG_PREEMPT_RT)) { - work = this_cpu_ptr(&up_read_work); - if (irq_work_is_busy(&work->irq_work)) { - /* cannot queue more up_read, fallback */ - irq_work_busy = true; - } - } else { - /* - * PREEMPT_RT does not allow to trylock mmap sem in - * interrupt disabled context. Force the fallback code. - */ - irq_work_busy = true; - } - } - /* - * We cannot do up_read() when the irq is disabled, because of - * risk to deadlock with rq_lock. To do build_id lookup when the - * irqs are disabled, we need to run up_read() in irq_work. We use - * a percpu variable to do the irq_work. If the irq_work is - * already used by another lookup, we fall back to report ips. - * - * Same fallback is used for kernel stack (!user) on a stackmap - * with build_id. + /* If the irq_work is in use, fall back to report ips. Same + * fallback is used for kernel stack (!user) on a stackmap with + * build_id. */ if (!user || !current || !current->mm || irq_work_busy || !mmap_read_trylock(current->mm)) { @@ -203,19 +162,7 @@ static void stack_map_get_build_id_offset(struct bpf_stack_build_id *id_offs, - vma->vm_start; id_offs[i].status = BPF_STACK_BUILD_ID_VALID; } - - if (!work) { - mmap_read_unlock(current->mm); - } else { - work->mm = current->mm; - - /* The lock will be released once we're out of interrupt - * context. Tell lockdep that we've released it now so - * it doesn't complain that we forgot to release it. - */ - rwsem_release(¤t->mm->mmap_lock.dep_map, _RET_IP_); - irq_work_queue(&work->irq_work); - } + bpf_mmap_unlock_mm(work, current->mm); } static struct perf_callchain_entry * @@ -719,16 +666,3 @@ const struct bpf_map_ops stack_trace_map_ops = { .map_btf_name = "bpf_stack_map", .map_btf_id = &stack_trace_map_btf_id, }; - -static int __init stack_map_init(void) -{ - int cpu; - struct stack_map_irq_work *work; - - for_each_possible_cpu(cpu) { - work = per_cpu_ptr(&up_read_work, cpu); - init_irq_work(&work->irq_work, do_up_read); - } - return 0; -} -subsys_initcall(stack_map_init); diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c index b48750bfba5aa..f171479f7dd6b 100644 --- a/kernel/bpf/task_iter.c +++ b/kernel/bpf/task_iter.c @@ -8,6 +8,7 @@ #include #include #include +#include "mmap_unlock_work.h" struct bpf_iter_seq_task_common { struct pid_namespace *ns; @@ -524,10 +525,6 @@ static const struct seq_operations task_vma_seq_ops = { .show = task_vma_seq_show, }; -BTF_ID_LIST(btf_task_file_ids) -BTF_ID(struct, file) -BTF_ID(struct, vm_area_struct) - static const struct bpf_iter_seq_info task_seq_info = { .seq_ops = &task_seq_ops, .init_seq_private = init_seq_pidns, @@ -586,9 +583,74 @@ static struct bpf_iter_reg task_vma_reg_info = { .seq_info = &task_vma_seq_info, }; +BPF_CALL_5(bpf_find_vma, struct task_struct *, task, u64, start, + bpf_callback_t, callback_fn, void *, callback_ctx, u64, flags) +{ + struct mmap_unlock_irq_work *work = NULL; + struct vm_area_struct *vma; + bool irq_work_busy = false; + struct mm_struct *mm; + int ret = -ENOENT; + + if (flags) + return -EINVAL; + + if (!task) + return -ENOENT; + + mm = task->mm; + if (!mm) + return -ENOENT; + + irq_work_busy = bpf_mmap_unlock_get_irq_work(&work); + + if (irq_work_busy || !mmap_read_trylock(mm)) + return -EBUSY; + + vma = find_vma(mm, start); + + if (vma && vma->vm_start <= start && vma->vm_end > start) { + callback_fn((u64)(long)task, (u64)(long)vma, + (u64)(long)callback_ctx, 0, 0); + ret = 0; + } + bpf_mmap_unlock_mm(work, mm); + return ret; +} + +const struct bpf_func_proto bpf_find_vma_proto = { + .func = bpf_find_vma, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_BTF_ID, + .arg1_btf_id = &btf_task_struct_ids[0], + .arg2_type = ARG_ANYTHING, + .arg3_type = ARG_PTR_TO_FUNC, + .arg4_type = ARG_PTR_TO_STACK_OR_NULL, + .arg5_type = ARG_ANYTHING, +}; + +DEFINE_PER_CPU(struct mmap_unlock_irq_work, mmap_unlock_work); + +static void do_mmap_read_unlock(struct irq_work *entry) +{ + struct mmap_unlock_irq_work *work; + + if (WARN_ON_ONCE(IS_ENABLED(CONFIG_PREEMPT_RT))) + return; + + work = container_of(entry, struct mmap_unlock_irq_work, irq_work); + mmap_read_unlock_non_owner(work->mm); +} + static int __init task_iter_init(void) { - int ret; + struct mmap_unlock_irq_work *work; + int ret, cpu; + + for_each_possible_cpu(cpu) { + work = per_cpu_ptr(&mmap_unlock_work, cpu); + init_irq_work(&work->irq_work, do_mmap_read_unlock); + } task_reg_info.ctx_arg_info[0].btf_id = btf_task_struct_ids[0]; ret = bpf_iter_reg_target(&task_reg_info); @@ -596,13 +658,13 @@ static int __init task_iter_init(void) return ret; task_file_reg_info.ctx_arg_info[0].btf_id = btf_task_struct_ids[0]; - task_file_reg_info.ctx_arg_info[1].btf_id = btf_task_file_ids[0]; + task_file_reg_info.ctx_arg_info[1].btf_id = btf_task_struct_ids[1]; ret = bpf_iter_reg_target(&task_file_reg_info); if (ret) return ret; task_vma_reg_info.ctx_arg_info[0].btf_id = btf_task_struct_ids[0]; - task_vma_reg_info.ctx_arg_info[1].btf_id = btf_task_file_ids[1]; + task_vma_reg_info.ctx_arg_info[1].btf_id = btf_task_struct_ids[2]; return bpf_iter_reg_target(&task_vma_reg_info); } late_initcall(task_iter_init); diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index f0dca726ebfde..1aafb43f61d1c 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -6132,6 +6132,33 @@ static int set_timer_callback_state(struct bpf_verifier_env *env, return 0; } +static int set_find_vma_callback_state(struct bpf_verifier_env *env, + struct bpf_func_state *caller, + struct bpf_func_state *callee, + int insn_idx) +{ + /* bpf_find_vma(struct task_struct *task, u64 addr, + * void *callback_fn, void *callback_ctx, u64 flags) + * (callback_fn)(struct task_struct *task, + * struct vm_area_struct *vma, void *callback_ctx); + */ + callee->regs[BPF_REG_1] = caller->regs[BPF_REG_1]; + + callee->regs[BPF_REG_2].type = PTR_TO_BTF_ID; + __mark_reg_known_zero(&callee->regs[BPF_REG_2]); + callee->regs[BPF_REG_2].btf = btf_vmlinux; + callee->regs[BPF_REG_2].btf_id = btf_task_struct_ids[2]; + + /* pointer to stack or null */ + callee->regs[BPF_REG_3] = caller->regs[BPF_REG_4]; + + /* unused */ + __mark_reg_not_init(env, &callee->regs[BPF_REG_4]); + __mark_reg_not_init(env, &callee->regs[BPF_REG_5]); + callee->in_callback_fn = true; + return 0; +} + static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx) { struct bpf_verifier_state *state = env->cur_state; @@ -6489,6 +6516,13 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn return -EINVAL; } + if (func_id == BPF_FUNC_find_vma) { + err = __check_func_call(env, insn, insn_idx_p, meta.subprogno, + set_find_vma_callback_state); + if (err < 0) + return -EINVAL; + } + if (func_id == BPF_FUNC_snprintf) { err = check_bpf_snprintf_call(env, regs); if (err < 0) diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 7396488793ff7..390176a3031ab 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -1208,6 +1208,8 @@ bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return &bpf_get_func_ip_proto_tracing; case BPF_FUNC_get_branch_snapshot: return &bpf_get_branch_snapshot_proto; + case BPF_FUNC_find_vma: + return &bpf_find_vma_proto; case BPF_FUNC_trace_vprintk: return bpf_get_trace_vprintk_proto(); default: diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index ba5af15e25f5c..509eee5f0393d 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -4938,6 +4938,25 @@ union bpf_attr { * **-ENOENT** if symbol is not found. * * **-EPERM** if caller does not have permission to obtain kernel address. + * + * long bpf_find_vma(struct task_struct *task, u64 addr, void *callback_fn, void *callback_ctx, u64 flags) + * Description + * Find vma of *task* that contains *addr*, call *callback_fn* + * function with *task*, *vma*, and *callback_ctx*. + * The *callback_fn* should be a static function and + * the *callback_ctx* should be a pointer to the stack. + * The *flags* is used to control certain aspects of the helper. + * Currently, the *flags* must be 0. + * + * The expected callback signature is + * + * long (\*callback_fn)(struct task_struct \*task, struct vm_area_struct \*vma, void \*callback_ctx); + * + * Return + * 0 on success. + * **-ENOENT** if *task->mm* is NULL, or no vma contains *addr*. + * **-EBUSY** if failed to try lock mmap_lock. + * **-EINVAL** for invalid **flags**. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -5120,6 +5139,7 @@ union bpf_attr { FN(trace_vprintk), \ FN(skc_to_unix_sock), \ FN(kallsyms_lookup_name), \ + FN(find_vma), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper From patchwork Thu Nov 4 21:31:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Liu X-Patchwork-Id: 12603807 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0A2B7C433F5 for ; Thu, 4 Nov 2021 21:31:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E326B61252 for ; Thu, 4 Nov 2021 21:31:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232040AbhKDVe1 (ORCPT ); Thu, 4 Nov 2021 17:34:27 -0400 Received: from mx0b-00082601.pphosted.com ([67.231.153.30]:59504 "EHLO mx0b-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230162AbhKDVe0 (ORCPT ); Thu, 4 Nov 2021 17:34:26 -0400 Received: from pps.filterd (m0109332.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 1A4JKar0001465 for ; Thu, 4 Nov 2021 14:31:47 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=H0jTJ/DFOuquvWipluSCQ8Qew94aHzkjKxLOev3YpLQ=; b=FbjjphT4heS1GJATptMkQ1Do8CColDbkjTNuEN+GyEZjRgNZiWRAhh+MQX6bXLsF7tG3 W6zfUBLY7O0YLFQnDHM/PXVgoTXfmJxa1nyEfbPaPapAXhylvXZHF/HWrbKqAXIb78WS jmPu+n7pCmt/rWa35QNzd1cdc/iBLJkCriU= Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com with ESMTP id 3c4d5un2tr-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 04 Nov 2021 14:31:47 -0700 Received: from intmgw001.37.frc1.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:82::e) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2308.14; Thu, 4 Nov 2021 14:31:46 -0700 Received: by devbig006.ftw2.facebook.com (Postfix, from userid 4523) id DFB691DEB6419; Thu, 4 Nov 2021 14:31:44 -0700 (PDT) From: Song Liu To: , CC: , , , , , Song Liu Subject: [PATCH v4 bpf-next 2/2] selftests/bpf: add tests for bpf_find_vma Date: Thu, 4 Nov 2021 14:31:38 -0700 Message-ID: <20211104213138.2779620-3-songliubraving@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211104213138.2779620-1-songliubraving@fb.com> References: <20211104213138.2779620-1-songliubraving@fb.com> MIME-Version: 1.0 X-FB-Internal: Safe X-FB-Source: Intern X-Proofpoint-GUID: eVZi2pZv2Pku2To3S55xHe22JX7jpn0j X-Proofpoint-ORIG-GUID: eVZi2pZv2Pku2To3S55xHe22JX7jpn0j X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-11-04_07,2021-11-03_01,2020-04-07_01 X-Proofpoint-Spam-Details: rule=fb_default_notspam policy=fb_default score=0 impostorscore=0 priorityscore=1501 spamscore=0 clxscore=1015 malwarescore=0 bulkscore=0 adultscore=0 suspectscore=0 mlxlogscore=999 phishscore=0 mlxscore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2111040082 X-FB-Internal: deliver Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: bpf@iogearbox.net Add tests for bpf_find_vma in perf_event program and kprobe program. The perf_event program is triggered from NMI context, so the second call of bpf_find_vma() will return -EBUSY (irq_work busy). The kprobe program, on the other hand, does not have this constraint. Also add tests for illegal writes to task or vma from the callback function. The verifier should reject both cases. Signed-off-by: Song Liu Acked-by: Yonghong Song --- .../selftests/bpf/prog_tests/find_vma.c | 115 ++++++++++++++++++ tools/testing/selftests/bpf/progs/find_vma.c | 69 +++++++++++ .../selftests/bpf/progs/find_vma_fail1.c | 29 +++++ .../selftests/bpf/progs/find_vma_fail2.c | 29 +++++ 4 files changed, 242 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/find_vma.c create mode 100644 tools/testing/selftests/bpf/progs/find_vma.c create mode 100644 tools/testing/selftests/bpf/progs/find_vma_fail1.c create mode 100644 tools/testing/selftests/bpf/progs/find_vma_fail2.c diff --git a/tools/testing/selftests/bpf/prog_tests/find_vma.c b/tools/testing/selftests/bpf/prog_tests/find_vma.c new file mode 100644 index 0000000000000..d081edbc9cae3 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/find_vma.c @@ -0,0 +1,115 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Facebook */ +#include +#include +#include +#include "find_vma.skel.h" +#include "find_vma_fail1.skel.h" +#include "find_vma_fail2.skel.h" + +static void test_and_reset_skel(struct find_vma *skel, int expected_find_zero_ret) +{ + ASSERT_EQ(skel->bss->found_vm_exec, 1, "found_vm_exec"); + ASSERT_EQ(skel->data->find_addr_ret, 0, "find_addr_ret"); + ASSERT_EQ(skel->data->find_zero_ret, expected_find_zero_ret, "find_zero_ret"); + ASSERT_OK_PTR(strstr(skel->bss->d_iname, "test_progs"), "find_test_progs"); + + skel->bss->found_vm_exec = 0; + skel->data->find_addr_ret = -1; + skel->data->find_zero_ret = -1; + skel->bss->d_iname[0] = 0; +} + +static int open_pe(void) +{ + struct perf_event_attr attr = {0}; + int pfd; + + /* create perf event */ + attr.size = sizeof(attr); + attr.type = PERF_TYPE_HARDWARE; + attr.config = PERF_COUNT_HW_CPU_CYCLES; + attr.freq = 1; + attr.sample_freq = 4000; + pfd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, PERF_FLAG_FD_CLOEXEC); + + return pfd >= 0 ? pfd : -errno; +} + +static void test_find_vma_pe(struct find_vma *skel) +{ + struct bpf_link *link = NULL; + volatile int j = 0; + int pfd = -1, i; + + pfd = open_pe(); + if (pfd < 0) { + if (pfd == -ENOENT || pfd == -EOPNOTSUPP) { + printf("%s:SKIP:no PERF_COUNT_HW_CPU_CYCLES\n", __func__); + test__skip(); + goto cleanup; + } + if (!ASSERT_GE(pfd, 0, "perf_event_open")) + goto cleanup; + } + + link = bpf_program__attach_perf_event(skel->progs.handle_pe, pfd); + if (!ASSERT_OK_PTR(link, "attach_perf_event")) + goto cleanup; + + for (i = 0; i < 1000000; ++i) + ++j; + + test_and_reset_skel(skel, -EBUSY /* in nmi, irq_work is busy */); +cleanup: + bpf_link__destroy(link); + close(pfd); +} + +static void test_find_vma_kprobe(struct find_vma *skel) +{ + int err; + + err = find_vma__attach(skel); + if (!ASSERT_OK(err, "get_branch_snapshot__attach")) + return; + + getpgid(skel->bss->target_pid); + test_and_reset_skel(skel, -ENOENT /* could not find vma for ptr 0 */); +} + +static void test_illegal_write_vma(void) +{ + struct find_vma_fail1 *skel; + + skel = find_vma_fail1__open_and_load(); + ASSERT_ERR_PTR(skel, "find_vma_fail1__open_and_load"); +} + +static void test_illegal_write_task(void) +{ + struct find_vma_fail2 *skel; + + skel = find_vma_fail2__open_and_load(); + ASSERT_ERR_PTR(skel, "find_vma_fail2__open_and_load"); +} + +void serial_test_find_vma(void) +{ + struct find_vma *skel; + + skel = find_vma__open_and_load(); + if (!ASSERT_OK_PTR(skel, "find_vma__open_and_load")) + return; + + skel->bss->target_pid = getpid(); + skel->bss->addr = (__u64)test_find_vma_pe; + + test_find_vma_pe(skel); + usleep(100000); /* allow the irq_work to finish */ + test_find_vma_kprobe(skel); + + find_vma__destroy(skel); + test_illegal_write_vma(); + test_illegal_write_task(); +} diff --git a/tools/testing/selftests/bpf/progs/find_vma.c b/tools/testing/selftests/bpf/progs/find_vma.c new file mode 100644 index 0000000000000..9451f0a047442 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/find_vma.c @@ -0,0 +1,69 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Facebook */ +#include "vmlinux.h" +#include +#include + +char _license[] SEC("license") = "GPL"; + +struct callback_ctx { + int dummy; +}; + +#define VM_EXEC 0x00000004 +#define DNAME_INLINE_LEN 32 + +pid_t target_pid = 0; +char d_iname[DNAME_INLINE_LEN] = {0}; +__u32 found_vm_exec = 0; +__u64 addr = 0; +int find_zero_ret = -1; +int find_addr_ret = -1; + +static long check_vma(struct task_struct *task, struct vm_area_struct *vma, + struct callback_ctx *data) +{ + if (vma->vm_file) + bpf_probe_read_kernel_str(d_iname, DNAME_INLINE_LEN - 1, + vma->vm_file->f_path.dentry->d_iname); + + /* check for VM_EXEC */ + if (vma->vm_flags & VM_EXEC) + found_vm_exec = 1; + + return 0; +} + +SEC("raw_tp/sys_enter") +int handle_getpid(void) +{ + struct task_struct *task = bpf_get_current_task_btf(); + struct callback_ctx data = {0}; + + if (task->pid != target_pid) + return 0; + + find_addr_ret = bpf_find_vma(task, addr, check_vma, &data, 0); + + /* this should return -ENOENT */ + find_zero_ret = bpf_find_vma(task, 0, check_vma, &data, 0); + return 0; +} + +SEC("perf_event") +int handle_pe(void) +{ + struct task_struct *task = bpf_get_current_task_btf(); + struct callback_ctx data = {0}; + + if (task->pid != target_pid) + return 0; + + find_addr_ret = bpf_find_vma(task, addr, check_vma, &data, 0); + + /* In NMI, this should return -EBUSY, as the previous call is using + * the irq_work. + */ + find_zero_ret = bpf_find_vma(task, 0, check_vma, &data, 0); + return 0; +} diff --git a/tools/testing/selftests/bpf/progs/find_vma_fail1.c b/tools/testing/selftests/bpf/progs/find_vma_fail1.c new file mode 100644 index 0000000000000..6ecd1515f1221 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/find_vma_fail1.c @@ -0,0 +1,29 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Facebook */ +#include "vmlinux.h" +#include + +char _license[] SEC("license") = "GPL"; + +struct callback_ctx { + int dummy; +}; + +static long write_vma(struct task_struct *task, struct vm_area_struct *vma, + struct callback_ctx *data) +{ + /* writing to vma, which is illegal */ + vma->vm_flags |= 0x55; + + return 0; +} + +SEC("raw_tp/sys_enter") +int handle_getpid(void) +{ + struct task_struct *task = bpf_get_current_task_btf(); + struct callback_ctx data = {0}; + + bpf_find_vma(task, 0, write_vma, &data, 0); + return 0; +} diff --git a/tools/testing/selftests/bpf/progs/find_vma_fail2.c b/tools/testing/selftests/bpf/progs/find_vma_fail2.c new file mode 100644 index 0000000000000..5ea93a67f6c8b --- /dev/null +++ b/tools/testing/selftests/bpf/progs/find_vma_fail2.c @@ -0,0 +1,29 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2021 Facebook */ +#include "vmlinux.h" +#include + +char _license[] SEC("license") = "GPL"; + +struct callback_ctx { + int dummy; +}; + +static long write_task(struct task_struct *task, struct vm_area_struct *vma, + struct callback_ctx *data) +{ + /* writing to task, which is illegal */ + task->mm = NULL; + + return 0; +} + +SEC("raw_tp/sys_enter") +int handle_getpid(void) +{ + struct task_struct *task = bpf_get_current_task_btf(); + struct callback_ctx data = {0}; + + bpf_find_vma(task, 0, write_task, &data, 0); + return 0; +}