diff mbox series

[bpf] riscv, bpf: Fix unpredictable kernel crash about RV64 struct_ops

Message ID 20240123023207.1917284-1-pulehui@huaweicloud.com (mailing list archive)
State Handled Elsewhere
Headers show
Series [bpf] riscv, bpf: Fix unpredictable kernel crash about RV64 struct_ops | expand

Checks

Context Check Description
conchuod/vmtest-for-next-PR success PR summary
conchuod/patch-1-test-1 success .github/scripts/patches/tests/build_rv32_defconfig.sh
conchuod/patch-1-test-2 success .github/scripts/patches/tests/build_rv64_clang_allmodconfig.sh
conchuod/patch-1-test-3 success .github/scripts/patches/tests/build_rv64_gcc_allmodconfig.sh
conchuod/patch-1-test-4 success .github/scripts/patches/tests/build_rv64_nommu_k210_defconfig.sh
conchuod/patch-1-test-5 success .github/scripts/patches/tests/build_rv64_nommu_virt_defconfig.sh
conchuod/patch-1-test-6 success .github/scripts/patches/tests/checkpatch.sh
conchuod/patch-1-test-7 success .github/scripts/patches/tests/dtb_warn_rv64.sh
conchuod/patch-1-test-8 success .github/scripts/patches/tests/header_inline.sh
conchuod/patch-1-test-9 success .github/scripts/patches/tests/kdoc.sh
conchuod/patch-1-test-10 success .github/scripts/patches/tests/module_param.sh
conchuod/patch-1-test-11 success .github/scripts/patches/tests/verify_fixes.sh
conchuod/patch-1-test-12 success .github/scripts/patches/tests/verify_signedoff.sh

Commit Message

Pu Lehui Jan. 23, 2024, 2:32 a.m. UTC
From: Pu Lehui <pulehui@huawei.com>

We encountered a kernel crash triggered by the bpf_tcp_ca testcase as
show below:

Unable to handle kernel paging request at virtual address ff60000088554500
Oops [#1]
...
CPU: 3 PID: 458 Comm: test_progs Tainted: G           OE      6.8.0-rc1-kselftest_plain #1
Hardware name: riscv-virtio,qemu (DT)
epc : 0xff60000088554500
 ra : tcp_ack+0x288/0x1232
epc : ff60000088554500 ra : ffffffff80cc7166 sp : ff2000000117ba50
 gp : ffffffff82587b60 tp : ff60000087be0040 t0 : ff60000088554500
 t1 : ffffffff801ed24e t2 : 0000000000000000 s0 : ff2000000117bbc0
 s1 : 0000000000000500 a0 : ff20000000691000 a1 : 0000000000000018
 a2 : 0000000000000001 a3 : ff60000087be03a0 a4 : 0000000000000000
 a5 : 0000000000000000 a6 : 0000000000000021 a7 : ffffffff8263f880
 s2 : 000000004ac3c13b s3 : 000000004ac3c13a s4 : 0000000000008200
 s5 : 0000000000000001 s6 : 0000000000000104 s7 : ff2000000117bb00
 s8 : ff600000885544c0 s9 : 0000000000000000 s10: ff60000086ff0b80
 s11: 000055557983a9c0 t3 : 0000000000000000 t4 : 000000000000ffc4
 t5 : ffffffff8154f170 t6 : 0000000000000030
status: 0000000200000120 badaddr: ff60000088554500 cause: 000000000000000c
Code: c796 67d7 0000 0000 0052 0002 c13b 4ac3 0000 0000 (0001) 0000
---[ end trace 0000000000000000 ]---

The reason is that commit 2cd3e3772e41 ("x86/cfi,bpf: Fix bpf_struct_ops
CFI") changes the func_addr of arch_prepare_bpf_trampoline in struct_ops
from NULL to non-NULL, while we use func_addr on RV64 to differentiate
between struct_ops and regular trampoline. When the struct_ops testcase
is triggered, it emits wrong prologue and epilogue, and lead to
unpredictable issues. After commit 2cd3e3772e41, we can use
BPF_TRAMP_F_INDIRECT to distinguish them as it always be set in
struct_ops.

Fixes: 2cd3e3772e41 ("x86/cfi,bpf: Fix bpf_struct_ops CFI")
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
---
 arch/riscv/net/bpf_jit_comp64.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

patchwork-bot+netdevbpf@kernel.org Jan. 23, 2024, 10:41 p.m. UTC | #1
Hello:

This patch was applied to bpf/bpf.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:

On Tue, 23 Jan 2024 02:32:07 +0000 you wrote:
> From: Pu Lehui <pulehui@huawei.com>
> 
> We encountered a kernel crash triggered by the bpf_tcp_ca testcase as
> show below:
> 
> Unable to handle kernel paging request at virtual address ff60000088554500
> Oops [#1]
> ...
> CPU: 3 PID: 458 Comm: test_progs Tainted: G           OE      6.8.0-rc1-kselftest_plain #1
> Hardware name: riscv-virtio,qemu (DT)
> epc : 0xff60000088554500
>  ra : tcp_ack+0x288/0x1232
> epc : ff60000088554500 ra : ffffffff80cc7166 sp : ff2000000117ba50
>  gp : ffffffff82587b60 tp : ff60000087be0040 t0 : ff60000088554500
>  t1 : ffffffff801ed24e t2 : 0000000000000000 s0 : ff2000000117bbc0
>  s1 : 0000000000000500 a0 : ff20000000691000 a1 : 0000000000000018
>  a2 : 0000000000000001 a3 : ff60000087be03a0 a4 : 0000000000000000
>  a5 : 0000000000000000 a6 : 0000000000000021 a7 : ffffffff8263f880
>  s2 : 000000004ac3c13b s3 : 000000004ac3c13a s4 : 0000000000008200
>  s5 : 0000000000000001 s6 : 0000000000000104 s7 : ff2000000117bb00
>  s8 : ff600000885544c0 s9 : 0000000000000000 s10: ff60000086ff0b80
>  s11: 000055557983a9c0 t3 : 0000000000000000 t4 : 000000000000ffc4
>  t5 : ffffffff8154f170 t6 : 0000000000000030
> status: 0000000200000120 badaddr: ff60000088554500 cause: 000000000000000c
> Code: c796 67d7 0000 0000 0052 0002 c13b 4ac3 0000 0000 (0001) 0000
> 
> [...]

Here is the summary with links:
  - [bpf] riscv, bpf: Fix unpredictable kernel crash about RV64 struct_ops
    https://git.kernel.org/bpf/bpf/c/1732ebc4a261

You are awesome, thank you!
diff mbox series

Patch

diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
index 58dc64dd94a8..719a97e7edb2 100644
--- a/arch/riscv/net/bpf_jit_comp64.c
+++ b/arch/riscv/net/bpf_jit_comp64.c
@@ -795,6 +795,7 @@  static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
 	struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
 	struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
 	struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
+	bool is_struct_ops = flags & BPF_TRAMP_F_INDIRECT;
 	void *orig_call = func_addr;
 	bool save_ret;
 	u32 insn;
@@ -878,7 +879,7 @@  static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
 
 	stack_size = round_up(stack_size, 16);
 
-	if (func_addr) {
+	if (!is_struct_ops) {
 		/* For the trampoline called from function entry,
 		 * the frame of traced function and the frame of
 		 * trampoline need to be considered.
@@ -998,7 +999,7 @@  static int __arch_prepare_bpf_trampoline(struct bpf_tramp_image *im,
 
 	emit_ld(RV_REG_S1, -sreg_off, RV_REG_FP, ctx);
 
-	if (func_addr) {
+	if (!is_struct_ops) {
 		/* trampoline called from function entry */
 		emit_ld(RV_REG_T0, stack_size - 8, RV_REG_SP, ctx);
 		emit_ld(RV_REG_FP, stack_size - 16, RV_REG_SP, ctx);