From patchwork Thu Oct 17 22:31:43 2024
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    kernel-team@fb.com, Martin KaFai Lau, Tejun Heo
Subject: [PATCH bpf-next v5 1/9] bpf: Allow each subprog having stack size of 512 bytes
Date: Thu, 17 Oct 2024 15:31:43 -0700
Message-ID: <20241017223143.3176192-1-yonghong.song@linux.dev>
In-Reply-To: <20241017223138.3175885-1-yonghong.song@linux.dev>
References: <20241017223138.3175885-1-yonghong.song@linux.dev>

With private stack support, each subprog can have a stack of up to 512
bytes. The 512-byte limit is kept per subprog to avoid increasing verifier
complexity: going beyond 512 bytes would require big verifier changes and
would increase memory consumption and verification time.

If private stack is supported, then for a bpf prog, especially one with
subprogs, private stack will be allocated for the main prog and for each
callback subprog. For example,

  main_prog
    subprog1
      calling helper
        subprog10 (callback func)
          subprog11
    subprog2
      calling helper
        subprog10 (callback func)
          subprog11

Separate private stack allocations for main_prog and the callback_fn
subprog10 make things easier since the helper function itself uses the
kernel stack.

In this patch, some tracing programs are allowed to use private stack
since a tracing prog may be triggered in the middle of some other prog's
run. Additional subprog info is also collected so that private stack can
later be allocated for the main prog and for each callback function.
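As an illustration of what this sets up, a minimal sketch of the
per-subtree private stack allocation (the actual allocation is added in a
later patch of this series; __alloc_percpu_gfp() is the kernel percpu
allocator):

  /* One per-cpu private stack per subtree root (main prog or callback
   * fn). subtree_stack_depth is the cumulative stack depth collected
   * for the subtree by the verifier in this patch.
   */
  void __percpu *priv_stack_ptr;

  priv_stack_ptr = __alloc_percpu_gfp(subtree_stack_depth, 8, GFP_KERNEL);
  if (!priv_stack_ptr)
          return -ENOMEM;
  prog->aux->priv_stack_ptr = priv_stack_ptr;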
Note that if any tail_call is called in the prog (including all subprogs), then private stack is not used. Signed-off-by: Yonghong Song --- include/linux/bpf.h | 1 + include/linux/bpf_verifier.h | 3 ++ include/linux/filter.h | 1 + kernel/bpf/core.c | 5 ++ kernel/bpf/verifier.c | 94 +++++++++++++++++++++++++++++++----- 5 files changed, 91 insertions(+), 13 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 0c216e71cec7..6ad8ace7075a 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1490,6 +1490,7 @@ struct bpf_prog_aux { bool exception_cb; bool exception_boundary; bool is_extended; /* true if extended by freplace program */ + bool priv_stack_eligible; u64 prog_array_member_cnt; /* counts how many times as member of prog_array */ struct mutex ext_mutex; /* mutex for is_extended and prog_array_member_cnt */ struct bpf_arena *arena; diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 4513372c5bc8..bcfe868e3801 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -659,6 +659,8 @@ struct bpf_subprog_info { * are used for bpf_fastcall spills and fills. */ s16 fastcall_stack_off; + u16 subtree_stack_depth; + u16 subtree_top_idx; bool has_tail_call: 1; bool tail_call_reachable: 1; bool has_ld_abs: 1; @@ -668,6 +670,7 @@ struct bpf_subprog_info { bool args_cached: 1; /* true if bpf_fastcall stack region is used by functions that can't be inlined */ bool keep_fastcall_stack: 1; + bool priv_stack_eligible: 1; u8 arg_cnt; struct bpf_subprog_arg_info args[MAX_BPF_FUNC_REG_ARGS]; diff --git a/include/linux/filter.h b/include/linux/filter.h index 7d7578a8eac1..3a21947f2fd4 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -1119,6 +1119,7 @@ bool bpf_jit_supports_exceptions(void); bool bpf_jit_supports_ptr_xchg(void); bool bpf_jit_supports_arena(void); bool bpf_jit_supports_insn(struct bpf_insn *insn, bool in_arena); +bool bpf_jit_supports_private_stack(void); u64 bpf_arch_uaddress_limit(void); void arch_bpf_stack_walk(bool (*consume_fn)(void *cookie, u64 ip, u64 sp, u64 bp), void *cookie); bool bpf_helper_changes_pkt_data(void *func); diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 233ea78f8f1b..14d9288441f2 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -3045,6 +3045,11 @@ bool __weak bpf_jit_supports_exceptions(void) return false; } +bool __weak bpf_jit_supports_private_stack(void) +{ + return false; +} + void __weak arch_bpf_stack_walk(bool (*consume_fn)(void *cookie, u64 ip, u64 sp, u64 bp), void *cookie) { } diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index f514247ba8ba..a12f5e823284 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -194,6 +194,8 @@ struct bpf_verifier_stack_elem { #define BPF_GLOBAL_PERCPU_MA_MAX_SIZE 512 +#define BPF_PRIV_STACK_MIN_SUBTREE_SIZE 128 + static int acquire_reference_state(struct bpf_verifier_env *env, int insn_idx); static int release_reference(struct bpf_verifier_env *env, int ref_obj_id); static void invalidate_non_owning_refs(struct bpf_verifier_env *env); @@ -5982,6 +5984,41 @@ static int check_ptr_alignment(struct bpf_verifier_env *env, strict); } +static bool bpf_enable_private_stack(struct bpf_verifier_env *env) +{ + if (!bpf_jit_supports_private_stack()) + return false; + + switch (env->prog->type) { + case BPF_PROG_TYPE_KPROBE: + case BPF_PROG_TYPE_TRACEPOINT: + case BPF_PROG_TYPE_PERF_EVENT: + case BPF_PROG_TYPE_RAW_TRACEPOINT: + return true; + case BPF_PROG_TYPE_TRACING: + if 
(env->prog->expected_attach_type != BPF_TRACE_ITER) + return true; + fallthrough; + default: + return false; + } +} + +static bool is_priv_stack_supported(struct bpf_verifier_env *env) +{ + struct bpf_subprog_info *si = env->subprog_info; + bool has_tail_call = false; + + for (int i = 0; i < env->subprog_cnt; i++) { + if (si[i].has_tail_call) { + has_tail_call = true; + break; + } + } + + return !has_tail_call && bpf_enable_private_stack(env); +} + static int round_up_stack_depth(struct bpf_verifier_env *env, int stack_depth) { if (env->prog->jit_requested) @@ -5999,16 +6036,21 @@ static int round_up_stack_depth(struct bpf_verifier_env *env, int stack_depth) * Since recursion is prevented by check_cfg() this algorithm * only needs a local stack of MAX_CALL_FRAMES to remember callsites */ -static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx) +static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx, + bool check_priv_stack, bool priv_stack_supported) { struct bpf_subprog_info *subprog = env->subprog_info; struct bpf_insn *insn = env->prog->insnsi; int depth = 0, frame = 0, i, subprog_end; bool tail_call_reachable = false; + bool priv_stack_eligible = false; int ret_insn[MAX_CALL_FRAMES]; int ret_prog[MAX_CALL_FRAMES]; - int j; + int j, subprog_stack_depth; + int orig_idx = idx; + if (check_priv_stack) + subprog[idx].subtree_top_idx = idx; i = subprog[idx].start; process_func: /* protect against potential stack overflow that might happen when @@ -6030,18 +6072,33 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx) * tailcall will unwind the current stack frame but it will not get rid * of caller's stack as shown on the example above. */ - if (idx && subprog[idx].has_tail_call && depth >= 256) { + if (!check_priv_stack && idx && subprog[idx].has_tail_call && depth >= 256) { verbose(env, "tail_calls are not allowed when call stack of previous frames is %d bytes. Too large\n", depth); return -EACCES; } - depth += round_up_stack_depth(env, subprog[idx].stack_depth); - if (depth > MAX_BPF_STACK) { + subprog_stack_depth = round_up_stack_depth(env, subprog[idx].stack_depth); + depth += subprog_stack_depth; + if (!check_priv_stack && !priv_stack_supported && depth > MAX_BPF_STACK) { verbose(env, "combined stack size of %d calls is %d. Too large\n", frame + 1, depth); return -EACCES; } + if (check_priv_stack) { + if (subprog_stack_depth > MAX_BPF_STACK) { + verbose(env, "stack size of subprog %d is %d. 
Too large\n", + idx, subprog_stack_depth); + return -EACCES; + } + + if (!priv_stack_eligible && depth >= BPF_PRIV_STACK_MIN_SUBTREE_SIZE) { + subprog[orig_idx].priv_stack_eligible = true; + env->prog->aux->priv_stack_eligible = priv_stack_eligible = true; + } + subprog[orig_idx].subtree_stack_depth = + max_t(u16, subprog[orig_idx].subtree_stack_depth, depth); + } continue_func: subprog_end = subprog[idx + 1].start; for (; i < subprog_end; i++) { @@ -6097,8 +6154,10 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx) } i = next_insn; idx = sidx; + if (check_priv_stack) + subprog[idx].subtree_top_idx = orig_idx; - if (subprog[idx].has_tail_call) + if (!check_priv_stack && subprog[idx].has_tail_call) tail_call_reachable = true; frame++; @@ -6122,7 +6181,7 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx) } subprog[ret_prog[j]].tail_call_reachable = true; } - if (subprog[0].tail_call_reachable) + if (!check_priv_stack && subprog[0].tail_call_reachable) env->prog->aux->tail_call_reachable = true; /* end of for() loop means the last insn of the 'subprog' @@ -6137,14 +6196,18 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx) goto continue_func; } -static int check_max_stack_depth(struct bpf_verifier_env *env) +static int check_max_stack_depth(struct bpf_verifier_env *env, bool check_priv_stack, + bool priv_stack_supported) { struct bpf_subprog_info *si = env->subprog_info; + bool check_subprog; int ret; for (int i = 0; i < env->subprog_cnt; i++) { - if (!i || si[i].is_async_cb) { - ret = check_max_stack_depth_subprog(env, i); + check_subprog = !i || (check_priv_stack ? si[i].is_cb : si[i].is_async_cb); + if (check_subprog) { + ret = check_max_stack_depth_subprog(env, i, check_priv_stack, + priv_stack_supported); if (ret < 0) return ret; } @@ -22303,7 +22366,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3 struct bpf_verifier_env *env; int i, len, ret = -EINVAL, err; u32 log_true_size; - bool is_priv; + bool is_priv, priv_stack_supported = false; /* no program is valid */ if (ARRAY_SIZE(bpf_verifier_ops) == 0) @@ -22430,8 +22493,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3 if (ret == 0) ret = remove_fastcall_spills_fills(env); - if (ret == 0) - ret = check_max_stack_depth(env); + if (ret == 0) { + priv_stack_supported = is_priv_stack_supported(env); + ret = check_max_stack_depth(env, false, priv_stack_supported); + } /* instruction rewrites happen after this point */ if (ret == 0) @@ -22465,6 +22530,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3 : false; } + if (ret == 0 && priv_stack_supported) + ret = check_max_stack_depth(env, true, true); + if (ret == 0) ret = fixup_call_args(env); From patchwork Thu Oct 17 22:31:48 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yonghong Song X-Patchwork-Id: 13840894 X-Patchwork-Delegate: bpf@iogearbox.net Received: from 66-220-155-178.mail-mxout.facebook.com (66-220-155-178.mail-mxout.facebook.com [66.220.155.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6F54E1D5CF9 for ; Thu, 17 Oct 2024 22:34:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=66.220.155.178 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; 
s=arc-20240116; t=1729204501; cv=none; b=c8WertTU4OITlqZGR+Wws8vf6GJLd/sgTO4mhd+s3sY26A1AXZUevV7E9bC8i+mJro2lK1knZcXjnY6vxsWm3+CqRKAAk8/H/Z9UgFAMMjU5/fgw4Gs4+9I5QYmgyDdbNhFN9bqlaJkvDG+yddau647VvLB10qIWGKiBLS27MNs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729204501; c=relaxed/simple; bh=VcnctrvCSyfUYO4bV0b6c6dSLokLMsgvou7rqVScEkc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=rJq4r/R7u+RGmCYobuthCP6k5ttD8JqkKAbyqY4Y5uYpM0G2hIe01AM2BPC0YqsMQw5KlvbOuens7AzPrfs4Gj3eB4IoP770++8xzGney3HqfWnC23azV/WlSkn6ZSGLokGcT5j5Kgc02ycO0bdtm96kMTmXVpM/vVeXGy9AK+4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.dev; spf=fail smtp.mailfrom=linux.dev; arc=none smtp.client-ip=66.220.155.178 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=linux.dev Received: by devbig309.ftw3.facebook.com (Postfix, from userid 128203) id EE9A6A2F07AE; Thu, 17 Oct 2024 15:31:48 -0700 (PDT) From: Yonghong Song To: bpf@vger.kernel.org Cc: Alexei Starovoitov , Andrii Nakryiko , Daniel Borkmann , kernel-team@fb.com, Martin KaFai Lau , Tejun Heo Subject: [PATCH bpf-next v5 2/9] bpf: Support private stack for struct_ops programs Date: Thu, 17 Oct 2024 15:31:48 -0700 Message-ID: <20241017223148.3176403-1-yonghong.song@linux.dev> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20241017223138.3175885-1-yonghong.song@linux.dev> References: <20241017223138.3175885-1-yonghong.song@linux.dev> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net Add 'priv_stack_allowed()' callback function in bpf_verifier_ops. If the callback function returns true, the struct_ops are eligible to use private stack. Otherwise, normal kernel stack is used. 
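As an example, a struct_ops implementer opts in like below (a sketch with
a hypothetical 'my_st_ops'; the sched-ext patch later in this series does
exactly this):

  static bool my_st_ops_priv_stack_allowed(void)
  {
          return true;
  }

  static const struct bpf_verifier_ops my_st_ops_verifier_ops = {
          .priv_stack_allowed = my_st_ops_priv_stack_allowed,
          /* other callbacks (get_func_proto, is_valid_access, ...) elided */
  };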
Signed-off-by: Yonghong Song
---
 include/linux/bpf.h   | 1 +
 kernel/bpf/verifier.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 6ad8ace7075a..a789cd2f5d6a 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -988,6 +988,7 @@ struct bpf_verifier_ops {
 	int (*btf_struct_access)(struct bpf_verifier_log *log,
 				 const struct bpf_reg_state *reg,
 				 int off, int size);
+	bool (*priv_stack_allowed)(void);
 };
 
 struct bpf_prog_offload_ops {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a12f5e823284..a14857015ad4 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5995,6 +5995,8 @@ static bool bpf_enable_private_stack(struct bpf_verifier_env *env)
 	case BPF_PROG_TYPE_PERF_EVENT:
 	case BPF_PROG_TYPE_RAW_TRACEPOINT:
 		return true;
+	case BPF_PROG_TYPE_STRUCT_OPS:
+		return env->ops->priv_stack_allowed && env->ops->priv_stack_allowed();
 	case BPF_PROG_TYPE_TRACING:
 		if (env->prog->expected_attach_type != BPF_TRACE_ITER)
 			return true;

From patchwork Thu Oct 17 22:31:54 2024
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    kernel-team@fb.com, Martin KaFai Lau, Tejun Heo
Subject: [PATCH bpf-next v5 3/9] sched-ext: Allow sched-ext progs to use private stack
Date: Thu, 17 Oct 2024 15:31:54 -0700
Message-ID: <20241017223154.3176602-1-yonghong.song@linux.dev>
In-Reply-To: <20241017223138.3175885-1-yonghong.song@linux.dev>
References: <20241017223138.3175885-1-yonghong.song@linux.dev>

Allow sched-ext struct_ops bpf progs to use private stack.
In the later jit support, recursion checking is added when private stack
is used, such that if the same prog is run again on the same cpu, that
second prog run will be skipped.

Signed-off-by: Yonghong Song
---
 kernel/sched/ext.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 3cd7c50a51c5..f186cf7cac90 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -5370,10 +5370,16 @@ bpf_scx_get_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	}
 }
 
+static bool bpf_scx_priv_stack_allowed(void)
+{
+	return true;
+}
+
 static const struct bpf_verifier_ops bpf_scx_verifier_ops = {
 	.get_func_proto = bpf_scx_get_func_proto,
 	.is_valid_access = bpf_scx_is_valid_access,
 	.btf_struct_access = bpf_scx_btf_struct_access,
+	.priv_stack_allowed = bpf_scx_priv_stack_allowed,
 };
 
 static int bpf_scx_init_member(const struct btf_type *t,

From patchwork Thu Oct 17 22:31:59 2024
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    kernel-team@fb.com, Martin KaFai Lau, Tejun Heo
Subject: [PATCH bpf-next v5 4/9] bpf: Mark each subprog with proper private stack modes
Date: Thu, 17 Oct 2024 15:31:59 -0700
Message-ID: <20241017223159.3176904-1-yonghong.song@linux.dev>
In-Reply-To: <20241017223138.3175885-1-yonghong.song@linux.dev>
References: <20241017223138.3175885-1-yonghong.song@linux.dev>

Three private stack modes are used to direct jit action:
  NO_PRIV_STACK:        do not use private stack
  PRIV_STACK_SUB_PROG:  adjust frame pointer address (similar to normal stack)
  PRIV_STACK_ROOT_PROG: set the frame pointer
Note that for a subtree root prog (main prog or callback fn), even if the
bpf_prog stack size is 0, PRIV_STACK_ROOT_PROG mode is still used. This is
for bpf exception handling. More details can be found in the subsequent
jit support and selftest patches.

Signed-off-by: Yonghong Song --- include/linux/bpf.h | 9 +++++++++ kernel/bpf/core.c | 19 +++++++++++++++++++ kernel/bpf/verifier.c | 29 +++++++++++++++++++++++++++++ 3 files changed, 57 insertions(+) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index a789cd2f5d6a..2c07a2e311f4 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1457,6 +1457,12 @@ struct btf_mod_pair { struct bpf_kfunc_desc_tab; +enum bpf_priv_stack_mode { + NO_PRIV_STACK, + PRIV_STACK_SUB_PROG, + PRIV_STACK_ROOT_PROG, +}; + struct bpf_prog_aux { atomic64_t refcnt; u32 used_map_cnt; @@ -1473,6 +1479,9 @@ struct bpf_prog_aux { u32 ctx_arg_info_size; u32 max_rdonly_access; u32 max_rdwr_access; + enum bpf_priv_stack_mode priv_stack_mode; + u16 subtree_stack_depth; /* Subtree stack depth if PRIV_STACK_ROOT_PROG, 0 otherwise */ + void __percpu *priv_stack_ptr; struct btf *attach_btf; const struct bpf_ctx_arg_aux *ctx_arg_info; struct mutex dst_mutex; /* protects dst_* pointers below, *after* prog becomes visible */ diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 14d9288441f2..aee0055def4f 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -1240,6 +1240,7 @@ void __weak bpf_jit_free(struct bpf_prog *fp) struct bpf_binary_header *hdr = bpf_jit_binary_hdr(fp); bpf_jit_binary_free(hdr); + free_percpu(fp->aux->priv_stack_ptr); WARN_ON_ONCE(!bpf_prog_kallsyms_verify_off(fp)); } @@ -2421,6 +2422,24 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err) if (*err) return fp; + if (fp->aux->priv_stack_eligible) { + if (!fp->aux->stack_depth) { + fp->aux->priv_stack_mode = NO_PRIV_STACK; + } else { + void __percpu *priv_stack_ptr; + + fp->aux->priv_stack_mode = PRIV_STACK_ROOT_PROG; + priv_stack_ptr = + __alloc_percpu_gfp(fp->aux->stack_depth, 8, GFP_KERNEL); + if (!priv_stack_ptr) { + *err = -ENOMEM; + return fp; + } + fp->aux->subtree_stack_depth = fp->aux->stack_depth; + fp->aux->priv_stack_ptr = priv_stack_ptr; + } + } + fp = bpf_int_jit_compile(fp); bpf_prog_jit_attempt_done(fp); if (!fp->jited && jit_needed) { diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index a14857015ad4..274b0b92177d 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -20010,6 +20010,8 @@ static int jit_subprogs(struct bpf_verifier_env *env) { struct bpf_prog *prog = env->prog, **func, *tmp; int i, j, subprog_start, subprog_end = 0, len, subprog; + int subtree_top_idx, subtree_stack_depth; + void __percpu *priv_stack_ptr; struct bpf_map *map_ptr; struct bpf_insn *insn; void *old_bpf_func; @@ -20088,6 +20090,33 @@ static int jit_subprogs(struct bpf_verifier_env *env) func[i]->is_func = 1; func[i]->sleepable = prog->sleepable; func[i]->aux->func_idx = i; + + subtree_top_idx = env->subprog_info[i].subtree_top_idx; + if (env->subprog_info[subtree_top_idx].priv_stack_eligible) { + if (subtree_top_idx == i) + func[i]->aux->subtree_stack_depth = + env->subprog_info[i].subtree_stack_depth; + + subtree_stack_depth = func[i]->aux->subtree_stack_depth; + if (subtree_top_idx != i) { + if (env->subprog_info[subtree_top_idx].subtree_stack_depth) + func[i]->aux->priv_stack_mode = PRIV_STACK_SUB_PROG; + else + func[i]->aux->priv_stack_mode = NO_PRIV_STACK; + } else if (!subtree_stack_depth) { +
func[i]->aux->priv_stack_mode = PRIV_STACK_ROOT_PROG; + } else { + func[i]->aux->priv_stack_mode = PRIV_STACK_ROOT_PROG; + priv_stack_ptr = + __alloc_percpu_gfp(subtree_stack_depth, 8, GFP_KERNEL); + if (!priv_stack_ptr) { + err = -ENOMEM; + goto out_free; + } + func[i]->aux->priv_stack_ptr = priv_stack_ptr; + } + } + /* Below members will be freed only at prog->aux */ func[i]->aux->btf = prog->aux->btf; func[i]->aux->func_info = prog->aux->func_info;

From patchwork Thu Oct 17 22:32:04 2024
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    kernel-team@fb.com, Martin KaFai Lau, Tejun Heo
Subject: [PATCH bpf-next v5 5/9] bpf, x86: Refactor func emit_prologue
Date: Thu, 17 Oct 2024 15:32:04 -0700
Message-ID: <20241017223204.3177432-1-yonghong.song@linux.dev>
In-Reply-To: <20241017223138.3175885-1-yonghong.song@linux.dev>
References: <20241017223138.3175885-1-yonghong.song@linux.dev>

Refactor function emit_prologue() so that it takes bpf_prog as one of its
arguments. This reduces the total number of arguments, since more
arguments would otherwise be added to this function in later patches.
Also add a variable 'stack_depth' to hold the value of
bpf_prog->aux->stack_depth, to simplify the code.
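The resulting signature change, for quick reference (matching the diff
below); the dropped booleans are derived from bpf_prog inside the
function instead:

  -static void emit_prologue(u8 **pprog, u32 stack_depth, bool ebpf_from_cbpf,
  -                          bool tail_call_reachable, bool is_subprog,
  -                          bool is_exception_cb);
  +static void emit_prologue(u8 **pprog, u32 stack_depth,
  +                          struct bpf_prog *bpf_prog, bool tail_call_reachable);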
Signed-off-by: Yonghong Song --- arch/x86/net/bpf_jit_comp.c | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 06b080b61aa5..6d24389e58a1 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -489,10 +489,12 @@ static void emit_prologue_tail_call(u8 **pprog, bool is_subprog) * bpf_tail_call helper will skip the first X86_TAIL_CALL_OFFSET bytes * while jumping to another program */ -static void emit_prologue(u8 **pprog, u32 stack_depth, bool ebpf_from_cbpf, - bool tail_call_reachable, bool is_subprog, - bool is_exception_cb) +static void emit_prologue(u8 **pprog, u32 stack_depth, struct bpf_prog *bpf_prog, + bool tail_call_reachable) { + bool ebpf_from_cbpf = bpf_prog_was_classic(bpf_prog); + bool is_exception_cb = bpf_prog->aux->exception_cb; + bool is_subprog = bpf_is_subprog(bpf_prog); u8 *prog = *pprog; emit_cfi(&prog, is_subprog ? cfi_bpf_subprog_hash : cfi_bpf_hash); @@ -1424,17 +1426,18 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image u64 arena_vm_start, user_vm_start; int i, excnt = 0; int ilen, proglen = 0; + u32 stack_depth; u8 *prog = temp; int err; + stack_depth = bpf_prog->aux->stack_depth; + arena_vm_start = bpf_arena_get_kern_vm_start(bpf_prog->aux->arena); user_vm_start = bpf_arena_get_user_vm_start(bpf_prog->aux->arena); detect_reg_usage(insn, insn_cnt, callee_regs_used); - emit_prologue(&prog, bpf_prog->aux->stack_depth, - bpf_prog_was_classic(bpf_prog), tail_call_reachable, - bpf_is_subprog(bpf_prog), bpf_prog->aux->exception_cb); + emit_prologue(&prog, stack_depth, bpf_prog, tail_call_reachable); /* Exception callback will clobber callee regs for its own use, and * restore the original callee regs from main prog's stack frame. 
*/ @@ -2128,7 +2131,7 @@ st: if (is_imm8(insn->off)) func = (u8 *) __bpf_call_base + imm32; if (tail_call_reachable) { - LOAD_TAIL_CALL_CNT_PTR(bpf_prog->aux->stack_depth); + LOAD_TAIL_CALL_CNT_PTR(stack_depth); ip += 7; } if (!imm32) @@ -2145,13 +2148,13 @@ st: if (is_imm8(insn->off)) &bpf_prog->aux->poke_tab[imm32 - 1], &prog, image + addrs[i - 1], callee_regs_used, - bpf_prog->aux->stack_depth, + stack_depth, ctx); else emit_bpf_tail_call_indirect(bpf_prog, &prog, callee_regs_used, - bpf_prog->aux->stack_depth, + stack_depth, image + addrs[i - 1], ctx); break;

From patchwork Thu Oct 17 22:32:09 2024
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    kernel-team@fb.com, Martin KaFai Lau, Tejun Heo
Subject: [PATCH bpf-next v5 6/9] bpf, x86: Create a helper for certain "reg = imm" operations
Date: Thu, 17 Oct 2024 15:32:09 -0700
Message-ID: <20241017223209.3177719-1-yonghong.song@linux.dev>
In-Reply-To: <20241017223138.3175885-1-yonghong.song@linux.dev>
References: <20241017223138.3175885-1-yonghong.song@linux.dev>

Create a helper to generate jited code for certain "reg = imm" alu
operations, where the operation is one of add/sub/and/or/xor. This helper
will be used in the subsequent patch.
Signed-off-by: Yonghong Song --- arch/x86/net/bpf_jit_comp.c | 82 +++++++++++++++++++++---------------- 1 file changed, 46 insertions(+), 36 deletions(-) diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 6d24389e58a1..6be8c739c3c2 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -1406,6 +1406,51 @@ static void emit_shiftx(u8 **pprog, u32 dst_reg, u8 src_reg, bool is64, u8 op) *pprog = prog; } +/* emit ADD/SUB/AND/OR/XOR 'reg = imm' operations */ +static void emit_alu_imm(u8 **pprog, u8 insn_code, u32 dst_reg, s32 imm32) +{ + u8 b2 = 0, b3 = 0; + u8 *prog = *pprog; + + maybe_emit_1mod(&prog, dst_reg, BPF_CLASS(insn_code) == BPF_ALU64); + + /* + * b3 holds 'normal' opcode, b2 short form only valid + * in case dst is eax/rax. + */ + switch (BPF_OP(insn_code)) { + case BPF_ADD: + b3 = 0xC0; + b2 = 0x05; + break; + case BPF_SUB: + b3 = 0xE8; + b2 = 0x2D; + break; + case BPF_AND: + b3 = 0xE0; + b2 = 0x25; + break; + case BPF_OR: + b3 = 0xC8; + b2 = 0x0D; + break; + case BPF_XOR: + b3 = 0xF0; + b2 = 0x35; + break; + } + + if (is_imm8(imm32)) + EMIT3(0x83, add_1reg(b3, dst_reg), imm32); + else if (is_axreg(dst_reg)) + EMIT1_off32(b2, imm32); + else + EMIT2_off32(0x81, add_1reg(b3, dst_reg), imm32); + + *pprog = prog; +} + #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp))) #define __LOAD_TCC_PTR(off) \ @@ -1567,42 +1612,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image case BPF_ALU64 | BPF_AND | BPF_K: case BPF_ALU64 | BPF_OR | BPF_K: case BPF_ALU64 | BPF_XOR | BPF_K: - maybe_emit_1mod(&prog, dst_reg, - BPF_CLASS(insn->code) == BPF_ALU64); - - /* - * b3 holds 'normal' opcode, b2 short form only valid - * in case dst is eax/rax. - */ - switch (BPF_OP(insn->code)) { - case BPF_ADD: - b3 = 0xC0; - b2 = 0x05; - break; - case BPF_SUB: - b3 = 0xE8; - b2 = 0x2D; - break; - case BPF_AND: - b3 = 0xE0; - b2 = 0x25; - break; - case BPF_OR: - b3 = 0xC8; - b2 = 0x0D; - break; - case BPF_XOR: - b3 = 0xF0; - b2 = 0x35; - break; - } - - if (is_imm8(imm32)) - EMIT3(0x83, add_1reg(b3, dst_reg), imm32); - else if (is_axreg(dst_reg)) - EMIT1_off32(b2, imm32); - else - EMIT2_off32(0x81, add_1reg(b3, dst_reg), imm32); + emit_alu_imm(&prog, insn->code, dst_reg, imm32); break; case BPF_ALU64 | BPF_MOV | BPF_K:

From patchwork Thu Oct 17 22:32:14 2024
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    kernel-team@fb.com, Martin KaFai Lau, Tejun Heo
Subject: [PATCH bpf-next v5 7/9] bpf, x86: Add jit support for private stack
Date: Thu, 17 Oct 2024 15:32:14 -0700
Message-ID: <20241017223214.3177977-1-yonghong.song@linux.dev>
In-Reply-To: <20241017223138.3175885-1-yonghong.song@linux.dev>
References: <20241017223138.3175885-1-yonghong.song@linux.dev>

Add jit support for private stack. For a particular subtree, e.g.,

  subtree_root  <== stack depth 120
    subprog1    <== stack depth 80
      subprog2  <== stack depth 40
    subprog3    <== stack depth 160

Let us say that priv_stack_ptr is the memory address allocated for the
private stack. The frame pointer for each prog above is calculated as
below:

  subtree_root  <== subtree_root_fp     = priv_stack_ptr + 120
    subprog1    <== subtree_subprog1_fp = subtree_root_fp + 80
      subprog2  <== subtree_subprog2_fp = subtree_subprog1_fp + 40
    subprog3    <== subtree_subprog3_fp = subtree_root_fp + 160

For any function call to a helper/kfunc, a push/pop of the prog frame
pointer is needed in order to preserve the frame pointer value. To deal
with exception handling, push/pop of the frame pointer is also used
around calls to subsequent subprogs. For example, for

  subtree_root
    subprog1
      ...
      insn: call bpf_throw
      ...

after jit, we will have

  subtree_root
    insn: push r9
    subprog1
      ...
      insn: push r9
      insn: call bpf_throw
      insn: pop r9
      ...
    insn: pop r9
  exception_handler
    pop r9
    ...

where r9 represents the frame pointer for each subprog.

Signed-off-by: Yonghong Song --- arch/x86/net/bpf_jit_comp.c | 88 +++++++++++++++++++++++++++++++++++- include/linux/bpf_verifier.h | 1 + 2 files changed, 87 insertions(+), 2 deletions(-) diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 6be8c739c3c2..86ebca32befc 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -325,6 +325,22 @@ struct jit_context { /* Number of bytes that will be skipped on tailcall */ #define X86_TAIL_CALL_OFFSET (12 + ENDBR_INSN_SIZE) +static void push_r9(u8 **pprog) +{ + u8 *prog = *pprog; + + EMIT2(0x41, 0x51); /* push r9 */ + *pprog = prog; +} + +static void pop_r9(u8 **pprog) +{ + u8 *prog = *pprog; + + EMIT2(0x41, 0x59); /* pop r9 */ + *pprog = prog; +} + static void push_r12(u8 **pprog) { u8 *prog = *pprog; @@ -484,13 +500,17 @@ static void emit_prologue_tail_call(u8 **pprog, bool is_subprog) *pprog = prog; } +static void emit_priv_frame_ptr(u8 **pprog, struct bpf_prog *bpf_prog, + enum bpf_priv_stack_mode priv_stack_mode); + /* * Emit x86-64 prologue code for BPF program.
* bpf_tail_call helper will skip the first X86_TAIL_CALL_OFFSET bytes * while jumping to another program */ static void emit_prologue(u8 **pprog, u32 stack_depth, struct bpf_prog *bpf_prog, - bool tail_call_reachable) + bool tail_call_reachable, + enum bpf_priv_stack_mode priv_stack_mode) { bool ebpf_from_cbpf = bpf_prog_was_classic(bpf_prog); bool is_exception_cb = bpf_prog->aux->exception_cb; @@ -520,6 +540,8 @@ static void emit_prologue(u8 **pprog, u32 stack_depth, struct bpf_prog *bpf_prog * first restore those callee-saved regs from stack, before * reusing the stack frame. */ + if (priv_stack_mode != NO_PRIV_STACK) + pop_r9(&prog); pop_callee_regs(&prog, all_callee_regs_used); pop_r12(&prog); /* Reset the stack frame. */ @@ -532,6 +554,8 @@ static void emit_prologue(u8 **pprog, u32 stack_depth, struct bpf_prog *bpf_prog /* X86_TAIL_CALL_OFFSET is here */ EMIT_ENDBR(); + emit_priv_frame_ptr(&prog, bpf_prog, priv_stack_mode); + /* sub rsp, rounded_stack_depth */ if (stack_depth) EMIT3_off32(0x48, 0x81, 0xEC, round_up(stack_depth, 8)); @@ -1451,6 +1475,42 @@ static void emit_alu_imm(u8 **pprog, u8 insn_code, u32 dst_reg, s32 imm32) *pprog = prog; } +static void emit_root_priv_frame_ptr(u8 **pprog, struct bpf_prog *bpf_prog, + u32 orig_stack_depth) +{ + void __percpu *priv_frame_ptr; + u8 *prog = *pprog; + + priv_frame_ptr = bpf_prog->aux->priv_stack_ptr + orig_stack_depth; + + /* movabs r9, priv_frame_ptr */ + emit_mov_imm64(&prog, X86_REG_R9, (long) priv_frame_ptr >> 32, + (u32) (long) priv_frame_ptr); +#ifdef CONFIG_SMP + /* add , gs:[] */ + EMIT2(0x65, 0x4c); + EMIT3(0x03, 0x0c, 0x25); + EMIT((u32)(unsigned long)&this_cpu_off, 4); +#endif + *pprog = prog; +} + +static void emit_priv_frame_ptr(u8 **pprog, struct bpf_prog *bpf_prog, + enum bpf_priv_stack_mode priv_stack_mode) +{ + u32 orig_stack_depth = round_up(bpf_prog->aux->stack_depth, 8); + u8 *prog = *pprog; + + if (priv_stack_mode == PRIV_STACK_ROOT_PROG) + emit_root_priv_frame_ptr(&prog, bpf_prog, orig_stack_depth); + else if (priv_stack_mode == PRIV_STACK_SUB_PROG && orig_stack_depth) + /* r9 += orig_stack_depth */ + emit_alu_imm(&prog, BPF_ALU64 | BPF_ADD | BPF_K, X86_REG_R9, + orig_stack_depth); + + *pprog = prog; +} + #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp))) #define __LOAD_TCC_PTR(off) \ @@ -1464,6 +1524,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image { bool tail_call_reachable = bpf_prog->aux->tail_call_reachable; struct bpf_insn *insn = bpf_prog->insnsi; + enum bpf_priv_stack_mode priv_stack_mode; bool callee_regs_used[4] = {}; int insn_cnt = bpf_prog->len; bool seen_exit = false; @@ -1476,13 +1537,17 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image int err; stack_depth = bpf_prog->aux->stack_depth; + priv_stack_mode = bpf_prog->aux->priv_stack_mode; + if (priv_stack_mode != NO_PRIV_STACK) + stack_depth = 0; arena_vm_start = bpf_arena_get_kern_vm_start(bpf_prog->aux->arena); user_vm_start = bpf_arena_get_user_vm_start(bpf_prog->aux->arena); detect_reg_usage(insn, insn_cnt, callee_regs_used); - emit_prologue(&prog, stack_depth, bpf_prog, tail_call_reachable); + emit_prologue(&prog, stack_depth, bpf_prog, tail_call_reachable, + priv_stack_mode); /* Exception callback will clobber callee regs for its own use, and * restore the original callee regs from main prog's stack frame. 
*/ @@ -1521,6 +1586,14 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image u8 *func; int nops; + if (priv_stack_mode != NO_PRIV_STACK) { + if (src_reg == BPF_REG_FP) + src_reg = X86_REG_R9; + + if (dst_reg == BPF_REG_FP) + dst_reg = X86_REG_R9; + } + switch (insn->code) { /* ALU */ case BPF_ALU | BPF_ADD | BPF_X: @@ -2146,9 +2219,15 @@ st: if (is_imm8(insn->off)) } if (!imm32) return -EINVAL; + if (priv_stack_mode != NO_PRIV_STACK) { + push_r9(&prog); + ip += 2; + } ip += x86_call_depth_emit_accounting(&prog, func, ip); if (emit_call(&prog, func, ip)) return -EINVAL; + if (priv_stack_mode != NO_PRIV_STACK) + pop_r9(&prog); break; } @@ -3572,6 +3651,11 @@ bool bpf_jit_supports_exceptions(void) return IS_ENABLED(CONFIG_UNWINDER_ORC); } +bool bpf_jit_supports_private_stack(void) +{ + return true; +} + void arch_bpf_stack_walk(bool (*consume_fn)(void *cookie, u64 ip, u64 sp, u64 bp), void *cookie) { #if defined(CONFIG_UNWINDER_ORC) diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index bcfe868e3801..dd28b05bcff0 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -891,6 +891,7 @@ static inline bool bpf_prog_check_recur(const struct bpf_prog *prog) case BPF_PROG_TYPE_TRACING: return prog->expected_attach_type != BPF_TRACE_ITER; case BPF_PROG_TYPE_STRUCT_OPS: + return prog->aux->priv_stack_eligible; case BPF_PROG_TYPE_LSM: return false; default:

From patchwork Thu Oct 17 22:32:19 2024
From: Yonghong Song
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
    kernel-team@fb.com, Martin KaFai Lau, Tejun Heo
Subject: [PATCH bpf-next v5 8/9] selftests/bpf: Add tracing prog private stack tests
Date: Thu, 17 Oct 2024 15:32:19 -0700
Message-ID: <20241017223219.3178522-1-yonghong.song@linux.dev>
In-Reply-To: <20241017223138.3175885-1-yonghong.song@linux.dev>
References: <20241017223138.3175885-1-yonghong.song@linux.dev>

Some private stack tests are added, including:
  - prog with stack size greater than BPF_PRIV_STACK_MIN_SUBTREE_SIZE.
  - prog with stack size less than BPF_PRIV_STACK_MIN_SUBTREE_SIZE.
  - prog with one subprog having MAX_BPF_STACK stack size and another
    subprog having non-zero stack size.
  - prog with callback function.
  - prog with exception in main prog or subprog.

Signed-off-by: Yonghong Song --- .../selftests/bpf/prog_tests/verifier.c | 2 + .../bpf/progs/verifier_private_stack.c | 216 ++++++++++++++++++ 2 files changed, 218 insertions(+) create mode 100644 tools/testing/selftests/bpf/progs/verifier_private_stack.c diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c index e26b5150fc43..635ff3509403 100644 --- a/tools/testing/selftests/bpf/prog_tests/verifier.c +++ b/tools/testing/selftests/bpf/prog_tests/verifier.c @@ -59,6 +59,7 @@ #include "verifier_or_jmp32_k.skel.h" #include "verifier_precision.skel.h" #include "verifier_prevent_map_lookup.skel.h" +#include "verifier_private_stack.skel.h" #include "verifier_raw_stack.skel.h" #include "verifier_raw_tp_writable.skel.h" #include "verifier_reg_equal.skel.h" @@ -185,6 +186,7 @@ void test_verifier_bpf_fastcall(void) { RUN(verifier_bpf_fastcall); } void test_verifier_or_jmp32_k(void) { RUN(verifier_or_jmp32_k); } void test_verifier_precision(void) { RUN(verifier_precision); } void test_verifier_prevent_map_lookup(void) { RUN(verifier_prevent_map_lookup); } +void test_verifier_private_stack(void) { RUN(verifier_private_stack); } void test_verifier_raw_stack(void) { RUN(verifier_raw_stack); } void test_verifier_raw_tp_writable(void) { RUN(verifier_raw_tp_writable); } void test_verifier_reg_equal(void) { RUN(verifier_reg_equal); } diff --git a/tools/testing/selftests/bpf/progs/verifier_private_stack.c b/tools/testing/selftests/bpf/progs/verifier_private_stack.c new file mode 100644 index 000000000000..e8de565f8b34 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/verifier_private_stack.c @@ -0,0 +1,216 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <vmlinux.h> +#include <bpf/bpf_helpers.h> +#include "bpf_misc.h" +#include "bpf_experimental.h" + +/* From include/linux/filter.h */ +#define MAX_BPF_STACK 512 + +#if defined(__TARGET_ARCH_x86) + +SEC("kprobe") +__description("Private stack, single prog") +__success +__arch_x86_64 +__jited(" movabsq $0x{{.*}}, %r9") +__jited(" addq %gs:0x{{.*}}, %r9") +__jited(" movl $0x2a, %edi") +__jited(" movq %rdi, -0x100(%r9)") +__naked void private_stack_single_prog(void) +{ + asm volatile ( + "r1 = 42;" + "*(u64 *)(r10 - 256) = r1;" + "r0 = 0;" + "exit;" + : + : + : __clobber_all); +} + +__used +__naked static void cumulative_stack_depth_subprog(void) +{ + asm volatile ( + "r1 = 41;" + "*(u64 *)(r10 - 32) = r1;" + "call %[bpf_get_smp_processor_id];" + "exit;" + :: __imm(bpf_get_smp_processor_id) + : __clobber_all); +} + +SEC("kprobe") +__description("Private stack, subtree > MAX_BPF_STACK") +__success +__arch_x86_64 +/* private stack fp for the main prog */ +__jited(" movabsq $0x{{.*}}, %r9") +__jited(" addq %gs:0x{{.*}}, %r9") +__jited(" movl $0x2a, %edi") +__jited(" movq %rdi, -0x200(%r9)") +__jited(" pushq %r9")
+__jited(" callq 0x{{.*}}") +__jited(" popq %r9") +__jited(" xorl %eax, %eax") +__naked void private_stack_nested_1(void) +{ + asm volatile ( + "r1 = 42;" + "*(u64 *)(r10 - %[max_bpf_stack]) = r1;" + "call cumulative_stack_depth_subprog;" + "r0 = 0;" + "exit;" + : + : __imm_const(max_bpf_stack, MAX_BPF_STACK) + : __clobber_all); +} + +SEC("kprobe") +__description("Private stack, subtree > MAX_BPF_STACK") +__success +__arch_x86_64 +/* private stack fp for the subprog */ +__jited(" addq $0x20, %r9") +__naked void private_stack_nested_2(void) +{ + asm volatile ( + "r1 = 42;" + "*(u64 *)(r10 - %[max_bpf_stack]) = r1;" + "call cumulative_stack_depth_subprog;" + "r0 = 0;" + "exit;" + : + : __imm_const(max_bpf_stack, MAX_BPF_STACK) + : __clobber_all); +} + +SEC("raw_tp") +__description("No private stack, nested") +__success +__arch_x86_64 +__jited(" subq $0x8, %rsp") +__naked void no_private_stack_nested(void) +{ + asm volatile ( + "r1 = 42;" + "*(u64 *)(r10 - 8) = r1;" + "call cumulative_stack_depth_subprog;" + "r0 = 0;" + "exit;" + : + : + : __clobber_all); +} + +__naked __noinline __used +static unsigned long loop_callback(void) +{ + asm volatile ( + "call %[bpf_get_prandom_u32];" + "r1 = 42;" + "*(u64 *)(r10 - 512) = r1;" + "call cumulative_stack_depth_subprog;" + "r0 = 0;" + "exit;" + : + : __imm(bpf_get_prandom_u32) + : __clobber_common); +} + +SEC("raw_tp") +__description("Private stack, callback") +__success +__arch_x86_64 +/* for func loop_callback */ +__jited("func #1") +__jited(" endbr64") +__jited(" nopl (%rax,%rax)") +__jited(" nopl (%rax)") +__jited(" pushq %rbp") +__jited(" movq %rsp, %rbp") +__jited(" endbr64") +__jited(" movabsq $0x{{.*}}, %r9") +__jited(" addq %gs:0x{{.*}}, %r9") +__jited(" pushq %r9") +__jited(" callq") +__jited(" popq %r9") +__jited(" movl $0x2a, %edi") +__jited(" movq %rdi, -0x200(%r9)") +__jited(" pushq %r9") +__jited(" callq") +__jited(" popq %r9") +__naked void private_stack_callback(void) +{ + asm volatile ( + "r1 = 1;" + "r2 = %[loop_callback];" + "r3 = 0;" + "r4 = 0;" + "call %[bpf_loop];" + "r0 = 0;" + "exit;" + : + : __imm_ptr(loop_callback), + __imm(bpf_loop) + : __clobber_common); +} + +SEC("fentry/bpf_fentry_test9") +__description("Private stack, exception in main prog") +__success __retval(0) +__arch_x86_64 +__jited(" pushq %r9") +__jited(" callq") +__jited(" popq %r9") +int private_stack_exception_main_prog(void) +{ + asm volatile ( + "r1 = 42;" + "*(u64 *)(r10 - 512) = r1;" + ::: __clobber_common); + + bpf_throw(0); + return 0; +} + +__used static int subprog_exception(void) +{ + bpf_throw(0); + return 0; +} + +SEC("fentry/bpf_fentry_test9") +__description("Private stack, exception in subprog") +__success __retval(0) +__arch_x86_64 +__jited(" movq %rdi, -0x200(%r9)") +__jited(" pushq %r9") +__jited(" callq") +__jited(" popq %r9") +int private_stack_exception_sub_prog(void) +{ + asm volatile ( + "r1 = 42;" + "*(u64 *)(r10 - 512) = r1;" + "call subprog_exception;" + ::: __clobber_common); + + return 0; +} + +#else + +SEC("kprobe") +__description("private stack is not supported, use a dummy test") +__success +int dummy_test(void) +{ + return 0; +} + +#endif + +char _license[] SEC("license") = "GPL"; From patchwork Thu Oct 17 22:32:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yonghong Song X-Patchwork-Id: 13840891 X-Patchwork-Delegate: bpf@iogearbox.net Received: from 66-220-155-179.mail-mxout.facebook.com (66-220-155-179.mail-mxout.facebook.com [66.220.155.179]) (using 
TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4F69C1D7E50 for ; Thu, 17 Oct 2024 22:32:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=66.220.155.179 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729204353; cv=none; b=X95okJvmerCJ82+gs1h2qnT6n7U0e7OT/6DEU4L9ol+zbBDZGKNoxmo7kjBv2Y2cFd+rfWbJXg7TxGb8f+4mN3r5sP0MITSGyu1mdYq0Lq6v4xLj75o3pG9bBdNCHIMSJBXU3wPohnNGBLGMTvKy6Xz7h8WG1fEDJBJBWLnyInI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729204353; c=relaxed/simple; bh=UmlFi/SJ5DVUbNNtWWXSinFH7DOMlMCzVumhXPMmhgk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=AbmD9Qvs89DZv+B9/PoBZLp87c1Ws7JfYu1KtUvESZmFDCyQedNa1k/Gm2F45yPxdXZGSwlnf0CU62IiJRKRnqdAIwut+EEp4W5c6PXn3G1qHclxoBcumV1O7LOwGvSuq/H50MMm45iBSp/eiMR0yLnpWuTuEGhR9a0Gt2iqgB0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.dev; spf=fail smtp.mailfrom=linux.dev; arc=none smtp.client-ip=66.220.155.179 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=linux.dev Received: by devbig309.ftw3.facebook.com (Postfix, from userid 128203) id CF81CA2F088C; Thu, 17 Oct 2024 15:32:24 -0700 (PDT) From: Yonghong Song To: bpf@vger.kernel.org Cc: Alexei Starovoitov , Andrii Nakryiko , Daniel Borkmann , kernel-team@fb.com, Martin KaFai Lau , Tejun Heo Subject: [PATCH bpf-next v5 9/9] selftests/bpf: Add struct_ops prog private stack tests Date: Thu, 17 Oct 2024 15:32:24 -0700 Message-ID: <20241017223224.3178796-1-yonghong.song@linux.dev> X-Mailer: git-send-email 2.43.5 In-Reply-To: <20241017223138.3175885-1-yonghong.song@linux.dev> References: <20241017223138.3175885-1-yonghong.song@linux.dev> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: bpf@iogearbox.net Add two tests for struct_ops using private stack. One is with nested two different callback functions and the other is the same callback function recursing itself. For the second case, at run time, the jit trampoline recursion check kicks in to prevent the recursion. 
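For reference, the trampoline recursion check that the second test relies
on is conceptually the following (a sketch of the existing per-cpu guard
in the trampoline enter path, not code added by this patch):

  /* if this prog is already active on the current cpu, skip this run */
  if (unlikely(this_cpu_inc_return(*(prog->active)) != 1)) {
          bpf_prog_inc_misses_counter(prog);
          return 0;
  }

bpf_prog_check_recur(), extended in the jit patch of this series, decides
whether this guard applies; with private stack it now returns true for
eligible struct_ops progs.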
Signed-off-by: Yonghong Song
---
 .../selftests/bpf/bpf_testmod/bpf_testmod.c   | 83 +++++++++++++++++++
 .../selftests/bpf/bpf_testmod/bpf_testmod.h   |  6 ++
 .../bpf/prog_tests/struct_ops_private_stack.c | 80 ++++++++++++++++++
 .../bpf/progs/struct_ops_private_stack.c      | 62 ++++++++++++++
 .../progs/struct_ops_private_stack_recur.c    | 50 +++++++++++
 5 files changed, 281 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/struct_ops_private_stack.c
 create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_private_stack.c
 create mode 100644 tools/testing/selftests/bpf/progs/struct_ops_private_stack_recur.c

diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
index 8835761d9a12..aa61aaa847a2 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
@@ -245,6 +245,39 @@ __bpf_kfunc void bpf_testmod_ctx_release(struct bpf_testmod_ctx *ctx)
 	call_rcu(&ctx->rcu, testmod_free_cb);
 }
 
+static struct bpf_testmod_ops3 *st_ops3;
+
+static int bpf_testmod_test_3(void)
+{
+	return 0;
+}
+
+static int bpf_testmod_test_4(void)
+{
+	return 0;
+}
+
+static struct bpf_testmod_ops3 __bpf_testmod_ops3 = {
+	.test_1 = bpf_testmod_test_3,
+	.test_2 = bpf_testmod_test_4,
+};
+
+static void bpf_testmod_test_struct_ops3(void)
+{
+	if (st_ops3)
+		st_ops3->test_1();
+}
+
+__bpf_kfunc void bpf_testmod_ops3_call_test_1(void)
+{
+	st_ops3->test_1();
+}
+
+__bpf_kfunc void bpf_testmod_ops3_call_test_2(void)
+{
+	st_ops3->test_2();
+}
+
 struct bpf_testmod_btf_type_tag_1 {
 	int a;
 };
@@ -380,6 +413,8 @@ bpf_testmod_test_read(struct file *file, struct kobject *kobj,
 
 	(void)bpf_testmod_test_arg_ptr_to_struct(&struct_arg1_2);
 
+	bpf_testmod_test_struct_ops3();
+
 	struct_arg3 = kmalloc((sizeof(struct bpf_testmod_struct_arg_3) +
 				sizeof(int)), GFP_KERNEL);
 	if (struct_arg3 != NULL) {
@@ -584,6 +619,8 @@ BTF_ID_FLAGS(func, bpf_kfunc_trusted_num_test, KF_TRUSTED_ARGS)
 BTF_ID_FLAGS(func, bpf_kfunc_rcu_task_test, KF_RCU)
 BTF_ID_FLAGS(func, bpf_testmod_ctx_create, KF_ACQUIRE | KF_RET_NULL)
 BTF_ID_FLAGS(func, bpf_testmod_ctx_release, KF_RELEASE)
+BTF_ID_FLAGS(func, bpf_testmod_ops3_call_test_1)
+BTF_ID_FLAGS(func, bpf_testmod_ops3_call_test_2)
 BTF_KFUNCS_END(bpf_testmod_common_kfunc_ids)
 
 BTF_ID_LIST(bpf_testmod_dtor_ids)
@@ -1094,6 +1131,16 @@ static const struct bpf_verifier_ops bpf_testmod_verifier_ops = {
 	.is_valid_access = bpf_testmod_ops_is_valid_access,
 };
 
+static bool bpf_testmod_ops3_priv_stack_allowed(void)
+{
+	return true;
+}
+
+static const struct bpf_verifier_ops bpf_testmod_verifier_ops3 = {
+	.is_valid_access = bpf_testmod_ops_is_valid_access,
+	.priv_stack_allowed = bpf_testmod_ops3_priv_stack_allowed,
+};
+
 static int bpf_dummy_reg(void *kdata, struct bpf_link *link)
 {
 	struct bpf_testmod_ops *ops = kdata;
@@ -1173,6 +1220,41 @@ struct bpf_struct_ops bpf_testmod_ops2 = {
 	.owner = THIS_MODULE,
 };
 
+static int st_ops3_reg(void *kdata, struct bpf_link *link)
+{
+	int err = 0;
+
+	mutex_lock(&st_ops_mutex);
+	if (st_ops3) {
+		pr_err("st_ops has already been registered\n");
+		err = -EEXIST;
+		goto unlock;
+	}
+	st_ops3 = kdata;
+
+unlock:
+	mutex_unlock(&st_ops_mutex);
+	return err;
+}
+
+static void st_ops3_unreg(void *kdata, struct bpf_link *link)
+{
+	mutex_lock(&st_ops_mutex);
+	st_ops3 = NULL;
+	mutex_unlock(&st_ops_mutex);
+}
+
+struct bpf_struct_ops bpf_testmod_ops3 = {
+	.verifier_ops = &bpf_testmod_verifier_ops3,
+	.init = bpf_testmod_ops_init,
+	.init_member = bpf_testmod_ops_init_member,
+	.reg = st_ops3_reg,
+	.unreg = st_ops3_unreg,
+	.cfi_stubs = &__bpf_testmod_ops3,
+	.name = "bpf_testmod_ops3",
+	.owner = THIS_MODULE,
+};
+
 static int bpf_test_mod_st_ops__test_prologue(struct st_ops_args *args)
 {
 	return 0;
@@ -1331,6 +1413,7 @@ static int bpf_testmod_init(void)
 	ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS,
 					       &bpf_testmod_kfunc_set);
 	ret = ret ?: register_bpf_struct_ops(&bpf_bpf_testmod_ops, bpf_testmod_ops);
 	ret = ret ?: register_bpf_struct_ops(&bpf_testmod_ops2, bpf_testmod_ops2);
+	ret = ret ?: register_bpf_struct_ops(&bpf_testmod_ops3, bpf_testmod_ops3);
 	ret = ret ?: register_bpf_struct_ops(&testmod_st_ops, bpf_testmod_st_ops);
 	ret = ret ?: register_btf_id_dtor_kfuncs(bpf_testmod_dtors,
 						 ARRAY_SIZE(bpf_testmod_dtors),
diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.h b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.h
index fb7dff47597a..59c600074eea 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.h
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.h
@@ -92,6 +92,12 @@ struct bpf_testmod_ops {
 
 struct bpf_testmod_ops2 {
 	int (*test_1)(void);
+	int (*test_2)(void);
+};
+
+struct bpf_testmod_ops3 {
+	int (*test_1)(void);
+	int (*test_2)(void);
 };
 
 struct st_ops_args {
diff --git a/tools/testing/selftests/bpf/prog_tests/struct_ops_private_stack.c b/tools/testing/selftests/bpf/prog_tests/struct_ops_private_stack.c
new file mode 100644
index 000000000000..16ea92eea2cf
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/struct_ops_private_stack.c
@@ -0,0 +1,80 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <test_progs.h>
+#include "struct_ops_private_stack.skel.h"
+#include "struct_ops_private_stack_recur.skel.h"
+
+static void test_private_stack(void)
+{
+	struct struct_ops_private_stack *skel;
+	struct bpf_link *link;
+	int err;
+
+	skel = struct_ops_private_stack__open();
+	if (!ASSERT_OK_PTR(skel, "struct_ops_private_stack__open"))
+		return;
+
+	if (skel->data->skip) {
+		test__skip();
+		goto cleanup;
+	}
+
+	err = struct_ops_private_stack__load(skel);
+	if (!ASSERT_OK(err, "struct_ops_private_stack__load"))
+		goto cleanup;
+
+	link = bpf_map__attach_struct_ops(skel->maps.testmod_1);
+	if (!ASSERT_OK_PTR(link, "attach_struct_ops"))
+		goto cleanup;
+
+	ASSERT_OK(trigger_module_test_read(256), "trigger_read");
+
+	ASSERT_EQ(skel->bss->val_i, 3, "val_i");
+	ASSERT_EQ(skel->bss->val_j, 8, "val_j");
+
+	bpf_link__destroy(link);
+
+cleanup:
+	struct_ops_private_stack__destroy(skel);
+}
+
+static void test_private_stack_recur(void)
+{
+	struct struct_ops_private_stack_recur *skel;
+	struct bpf_link *link;
+	int err;
+
+	skel = struct_ops_private_stack_recur__open();
+	if (!ASSERT_OK_PTR(skel, "struct_ops_private_stack_recur__open"))
+		return;
+
+	if (skel->data->skip) {
+		test__skip();
+		goto cleanup;
+	}
+
+	err = struct_ops_private_stack_recur__load(skel);
+	if (!ASSERT_OK(err, "struct_ops_private_stack_recur__load"))
+		goto cleanup;
+
+	link = bpf_map__attach_struct_ops(skel->maps.testmod_1);
+	if (!ASSERT_OK_PTR(link, "attach_struct_ops"))
+		goto cleanup;
+
+	ASSERT_OK(trigger_module_test_read(256), "trigger_read");
+
+	ASSERT_EQ(skel->bss->val_j, 3, "val_j");
+
+	bpf_link__destroy(link);
+
+cleanup:
+	struct_ops_private_stack_recur__destroy(skel);
+}
+
+void test_struct_ops_private_stack(void)
+{
+	if (test__start_subtest("private_stack"))
+		test_private_stack();
+	if (test__start_subtest("private_stack_recur"))
+		test_private_stack_recur();
+}
diff --git a/tools/testing/selftests/bpf/progs/struct_ops_private_stack.c b/tools/testing/selftests/bpf/progs/struct_ops_private_stack.c
new file mode 100644
index 000000000000..921974263587
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/struct_ops_private_stack.c
@@ -0,0 +1,62 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include "../bpf_testmod/bpf_testmod.h"
+
+char _license[] SEC("license") = "GPL";
+
+#if defined(__TARGET_ARCH_x86)
+bool skip __attribute((__section__(".data"))) = false;
+#else
+bool skip = true;
+#endif
+
+void bpf_testmod_ops3_call_test_2(void) __ksym;
+
+int val_i, val_j;
+
+__noinline static int subprog2(int *a, int *b)
+{
+	return val_i + a[10] + b[20];
+}
+
+__noinline static int subprog1(int *a)
+{
+	/* stack size 400 bytes */
+	volatile int b[100] = {};
+
+	b[20] = 2;
+	return subprog2(a, (int *)b);
+}
+
+
+SEC("struct_ops")
+int BPF_PROG(test_1)
+{
+	/* stack size 400 bytes */
+	volatile int a[100] = {};
+
+	a[10] = 1;
+	val_i = subprog1((int *)a);
+	bpf_testmod_ops3_call_test_2();
+	return 0;
+}
+
+SEC("struct_ops")
+int BPF_PROG(test_2)
+{
+	/* stack size 400 bytes */
+	volatile int a[100] = {};
+
+	a[10] = 3;
+	val_j = subprog1((int *)a);
+	return 0;
+}
+
+SEC(".struct_ops")
+struct bpf_testmod_ops3 testmod_1 = {
+	.test_1 = (void *)test_1,
+	.test_2 = (void *)test_2,
+};
diff --git a/tools/testing/selftests/bpf/progs/struct_ops_private_stack_recur.c b/tools/testing/selftests/bpf/progs/struct_ops_private_stack_recur.c
new file mode 100644
index 000000000000..c593059cea3c
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/struct_ops_private_stack_recur.c
@@ -0,0 +1,50 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include "../bpf_testmod/bpf_testmod.h"
+
+char _license[] SEC("license") = "GPL";
+
+#if defined(__TARGET_ARCH_x86)
+bool skip __attribute((__section__(".data"))) = false;
+#else
+bool skip = true;
+#endif
+
+void bpf_testmod_ops3_call_test_1(void) __ksym;
+
+int val_i, val_j;
+
+__noinline static int subprog2(int *a, int *b)
+{
+	return val_i + a[10] + b[20];
+}
+
+__noinline static int subprog1(int *a)
+{
+	/* stack size 400 bytes */
+	volatile int b[100] = {};
+
+	b[20] = 2;
+	return subprog2(a, (int *)b);
+}
+
+
+SEC("struct_ops")
+int BPF_PROG(test_1)
+{
+	/* stack size 400 bytes */
+	volatile int a[100] = {};
+
+	a[10] = 1;
+	val_j += subprog1((int *)a);
+	bpf_testmod_ops3_call_test_1();
+	return 0;
+}
+
+SEC(".struct_ops")
+struct bpf_testmod_ops3 testmod_1 = {
+	.test_1 = (void *)test_1,
+};
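The values asserted in the prog_tests file above (val_i == 3 and val_j == 8
for the nested variant, val_j == 3 for the recursive one) follow from the
programs' arithmetic. A minimal host-side model in plain C (not BPF) that
traces them, assuming the globals start zeroed as .bss does:

#include <assert.h>

static int val_i, val_j;

static int subprog2(int a10, int b20) { return val_i + a10 + b20; }
static int subprog1(int a10) { return subprog2(a10, 2); /* b[20] = 2 */ }

int main(void)
{
	/* struct_ops_private_stack.c: test_1 runs, then invokes test_2
	 * through the bpf_testmod_ops3_call_test_2() kfunc.
	 */
	val_i = subprog1(1);	/* 0 + 1 + 2 = 3 */
	val_j = subprog1(3);	/* 3 + 3 + 2 = 8 */
	assert(val_i == 3 && val_j == 8);

	/* struct_ops_private_stack_recur.c: test_1 would re-enter itself
	 * through bpf_testmod_ops3_call_test_1(), but the trampoline
	 * recursion check stops the second invocation, so val_j is only
	 * accumulated once.
	 */
	val_i = val_j = 0;
	val_j += subprog1(1);	/* 0 + 1 + 2 = 3 */
	assert(val_j == 3);
	return 0;
}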