Message ID | 20221202103620.1915679-1-bjorn@kernel.org (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Series | [bpf] bpf: Proper R0 zero-extension for BPF_CALL instructions |
On Fri, 2022-12-02 at 11:36 +0100, Björn Töpel wrote:
> From: Björn Töpel <bjorn@rivosinc.com>
>
> A BPF call instruction can be, correctly, marked with zext_dst set to
> true. An example of this can be found in the BPF selftests
> progs/bpf_cubic.c:
>
>   ...
>   extern __u32 tcp_reno_undo_cwnd(struct sock *sk) __ksym;
>
>   __u32 BPF_STRUCT_OPS(bpf_cubic_undo_cwnd, struct sock *sk)
>   {
>           return tcp_reno_undo_cwnd(sk);
>   }
>   ...
>
> which compiles to:
>   0:       r1 = *(u64 *)(r1 + 0x0)
>   1:       call -0x1
>   2:       exit
>
> The call will be marked as zext_dst set to true, and for some backends
> (bpf_jit_needs_zext() returns true) expanded to:
>   0:       r1 = *(u64 *)(r1 + 0x0)
>   1:       call -0x1
>   2:       w0 = w0
>   3:       exit

In the verifier, the marking is done by check_kfunc_call() (added in
e6ac2450d6de), right? So the problem occurs only for kfuncs?

        /* Check return type */
        t = btf_type_skip_modifiers(desc_btf, func_proto->type, NULL);

        ...

        if (btf_type_is_scalar(t)) {
                mark_reg_unknown(env, regs, BPF_REG_0);
                mark_btf_func_reg_size(env, BPF_REG_0, t->size);

I tried to find some official information whether the eBPF calling
convention requires sign- or zero-extending return values and
arguments, but unfortunately [1] doesn't mention this.

LLVM's lib/Target/BPF/BPFCallingConv.td mentions both R* and W*
registers, but since assigning to W* leads to zero-extension, it seems
to me that this is the case.

If the above is correct, then shouldn't we rather use sizeof(void *) in
the mark_btf_func_reg_size() call above?

> The opt_subreg_zext_lo32_rnd_hi32() function which is responsible for
> the zext patching, relies on insn_def_regno() to fetch the register to
> zero-extend. However, this function does not handle call instructions
> correctly, and opt_subreg_zext_lo32_rnd_hi32() fails the
> verification.
>
> Make sure that R0 is correctly resolved for (BPF_JMP | BPF_CALL)
> instructions.
>
> Fixes: 83a2881903f3 ("bpf: Account for BPF_FETCH in insn_has_def32()")
> Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
> ---
> I'm not super happy about the additional special case -- first
> cmpxchg, and now call. :-( A more elegant/generic solution is
> welcome!
> ---
>  kernel/bpf/verifier.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 264b3dc714cc..4f9660eafc72 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -13386,6 +13386,9 @@ static int opt_subreg_zext_lo32_rnd_hi32(struct bpf_verifier_env *env,
>  		if (!bpf_jit_needs_zext() && !is_cmpxchg_insn(&insn))
>  			continue;
>  
> +		if (insn.code == (BPF_JMP | BPF_CALL))
> +			load_reg = BPF_REG_0;
> +
>  		if (WARN_ON(load_reg == -1)) {
>  			verbose(env, "verifier bug. zext_dst is set, but no reg is defined\n");
>  			return -EFAULT;
>
> base-commit: 01f856ae6d0ca5ad0505b79bf2d22d7ca439b2a1

[1] https://docs.kernel.org/bpf/instruction-set.html#registers-and-calling-convention
Ilya Leoshkevich <iii@linux.ibm.com> writes:

> On Fri, 2022-12-02 at 11:36 +0100, Björn Töpel wrote:
[...]
>> The call will be marked as zext_dst set to true, and for some backends
>> (bpf_jit_needs_zext() returns true) expanded to:
>>   0:       r1 = *(u64 *)(r1 + 0x0)
>>   1:       call -0x1
>>   2:       w0 = w0
>>   3:       exit
>
> In the verifier, the marking is done by check_kfunc_call() (added in
> e6ac2450d6de), right? So the problem occurs only for kfuncs?

I've only seen it for kfuncs, yes.

>         /* Check return type */
>         t = btf_type_skip_modifiers(desc_btf, func_proto->type, NULL);
>
>         ...
>
>         if (btf_type_is_scalar(t)) {
>                 mark_reg_unknown(env, regs, BPF_REG_0);
>                 mark_btf_func_reg_size(env, BPF_REG_0, t->size);
>
> I tried to find some official information whether the eBPF calling
> convention requires sign- or zero-extending return values and
> arguments, but unfortunately [1] doesn't mention this.
>
> LLVM's lib/Target/BPF/BPFCallingConv.td mentions both R* and W*
> registers, but since assigning to W* leads to zero-extension, it seems
> to me that this is the case.
>
> If the above is correct, then shouldn't we rather use sizeof(void *) in
> the mark_btf_func_reg_size() call above?

Hmm, or rather sizeof(u64) if I'm reading you correctly?

Thanks for having a look!
Björn
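For readers following the sign- versus zero-extension question quoted above, here is a minimal userspace sketch (not kernel code) of what a 32-bit subregister write such as `w0 = w0` does to the 64-bit register, next to the sign-extending alternative. The helper names `zext32` and `sext32` are made up for this illustration.

```c
#include <stdint.h>

/* Model of a BPF ALU32 move such as "w0 = w0": the low 32 bits are
 * kept and the upper 32 bits of the 64-bit register become zero. */
static uint64_t zext32(uint64_t reg)
{
	return (uint64_t)(uint32_t)reg;
}

/* The alternative convention: treat the low 32 bits as a signed value
 * and replicate bit 31 into the upper half. The unsigned-to-signed
 * cast relies on the usual two's-complement behaviour of mainstream
 * compilers. */
static uint64_t sext32(uint64_t reg)
{
	return (uint64_t)(int64_t)(int32_t)(uint32_t)reg;
}
```

The two helpers agree whenever bit 31 of the input is clear and differ only in the upper half otherwise, which is why the choice of convention matters for return types such as `__u32`.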
On Tue, 2022-12-06 at 14:49 +0100, Björn Töpel wrote:
> Ilya Leoshkevich <iii@linux.ibm.com> writes:
[...]
> > If the above is correct, then shouldn't we rather use sizeof(void *)
> > in the mark_btf_func_reg_size() call above?
>
> Hmm, or rather sizeof(u64) if I'm reading you correctly?

Whoops, you are right - that's indeed what I meant here.
On 12/6/22 5:21 AM, Ilya Leoshkevich wrote:
> On Fri, 2022-12-02 at 11:36 +0100, Björn Töpel wrote:
[...]
> I tried to find some official information whether the eBPF calling
> convention requires sign- or zero-extending return values and
> arguments, but unfortunately [1] doesn't mention this.
>
> LLVM's lib/Target/BPF/BPFCallingConv.td mentions both R* and W*
> registers, but since assigning to W* leads to zero-extension, it seems
> to me that this is the case.

We actually follow the clang convention, the zero-extension is either
done in caller or callee, but not both. See
https://reviews.llvm.org/D131598 how the convention could be changed.

The following is an example.

$ cat t.c
extern unsigned foo(void);
unsigned bar1(void) {
     return foo();
}
unsigned bar2(void) {
     if (foo()) return 10; else return 20;
}
$ clang -target bpf -mcpu=v3 -O2 -c t.c && llvm-objdump -d t.o

t.o:    file format elf64-bpf

Disassembly of section .text:

0000000000000000 <bar1>:
        0:       85 10 00 00 ff ff ff ff call -0x1
        1:       95 00 00 00 00 00 00 00 exit

0000000000000010 <bar2>:
        2:       85 10 00 00 ff ff ff ff call -0x1
        3:       bc 01 00 00 00 00 00 00 w1 = w0
        4:       b4 00 00 00 14 00 00 00 w0 = 0x14
        5:       16 01 01 00 00 00 00 00 if w1 == 0x0 goto +0x1 <LBB1_2>
        6:       b4 00 00 00 0a 00 00 00 w0 = 0xa

0000000000000038 <LBB1_2>:
        7:       95 00 00 00 00 00 00 00 exit
$

If the return value of 'foo()' is actually used in the bpf program, the
proper zero extension will be done. Otherwise, it is not done.

This is with latest llvm16. I guess we need to check llvm whether
we could enforce to add a w0 = w0 in bar1().

Otherwise, with this patch, it will add w0 = w0 in all cases which
is not necessary in most of practical cases.

> If the above is correct, then shouldn't we rather use sizeof(void *) in
> the mark_btf_func_reg_size() call above?
[...]
> [1] https://docs.kernel.org/bpf/instruction-set.html#registers-and-calling-convention
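The difference between bar1() and bar2() in the disassembly above can be sketched in plain C. This is only a userspace model: `foo_raw_r0` is a hypothetical callee, and the assumption that it leaves unrelated bits in the upper half of the return register is made purely for illustration.

```c
#include <stdint.h>

/* Hypothetical callee: returns a 32-bit result of 0 in the low half of
 * the 64-bit return register, with unrelated bits left in the upper
 * half (an assumption made for this illustration). */
static uint64_t foo_raw_r0(void)
{
	return 0xdeadbeef00000000ULL;
}

/* bar1(): "call; exit". R0 is handed back untouched, so whatever is in
 * the upper 32 bits passes straight through to bar1()'s caller. */
static uint64_t bar1_model(void)
{
	return foo_raw_r0();
}

/* bar2(): the result is consumed through 32-bit subregisters, so only
 * the low 32 bits take part in the comparison. */
static uint64_t bar2_model(void)
{
	uint32_t w1 = (uint32_t)foo_raw_r0(); /* w1 = w0 */
	uint32_t w0 = 20;                     /* w0 = 0x14 */

	if (w1 != 0)                          /* if w1 == 0x0 goto exit */
		w0 = 10;                      /* w0 = 0xa */
	return w0;
}
```

In this model bar2_model() behaves correctly regardless of the upper bits, while bar1_model() leaves any extension to whoever reads its return value, which matches the "caller or callee, but not both" description.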
Yonghong Song <yhs@meta.com> writes:

> On 12/6/22 5:21 AM, Ilya Leoshkevich wrote:
[...]
> We actually follow the clang convention, the zero-extension is either
> done in caller or callee, but not both. See
> https://reviews.llvm.org/D131598 how the convention could be changed.
>
> The following is an example.
[...]
> If the return value of 'foo()' is actually used in the bpf program, the
> proper zero extension will be done. Otherwise, it is not done.
>
> This is with latest llvm16. I guess we need to check llvm whether
> we could enforce to add a w0 = w0 in bar1().
>
> Otherwise, with this patch, it will add w0 = w0 in all cases which
> is not necessary in most of practical cases.

Thanks, Yonghong! So, what would the correct fix be? We don't want the
verifier to mark the call for zext_dst in my commit message example,
since the zext will be properly done by LLVM.

Wdyt about Ilya's suggestion marking R0 as 64b? That avoids hitting my
"verifier bug", but I'm not well versed enough in verifier land to say
whether that breaks something else... I.e. is setting reg->subreg_def
to DEF_NOT_SUBREG for R0 correct?
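The effect of passing sizeof(u64) instead of t->size can be sketched with a toy model of the R0 bookkeeping. This is a simplified stand-in based on how the thread describes the marking, not the verifier's actual code: `struct reg_model` and `mark_ret_size` are hypothetical names, and the value chosen for `DEF_NOT_SUBREG` belongs to this model only.

```c
#include <stddef.h>
#include <stdint.h>

/* In this model, 0 means "R0 holds a full 64-bit definition". */
#define DEF_NOT_SUBREG 0

struct reg_model {
	uint32_t subreg_def; /* DEF_NOT_SUBREG, or 1 + index of the defining insn */
};

/* Record how the return value in R0 was defined: anything narrower
 * than 64 bits is remembered as a subregister definition tied to the
 * call instruction, which is what can later lead to zext_dst. */
static void mark_ret_size(struct reg_model *r0, size_t ret_size,
			  uint32_t insn_idx)
{
	r0->subreg_def = ret_size == sizeof(uint64_t) ?
		DEF_NOT_SUBREG : insn_idx + 1;
}
```

With a `__u32` return type the model ties R0 to the call instruction; with a fixed 8-byte size it does not. Note that sizeof(u64) is always 8, whereas sizeof(void *) is 4 on 32-bit kernels, which is why the earlier correction from sizeof(void *) matters.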
On 12/6/22 9:47 AM, Yonghong Song wrote:
> On 12/6/22 5:21 AM, Ilya Leoshkevich wrote:
>> On Fri, 2022-12-02 at 11:36 +0100, Björn Töpel wrote:
[...]
>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>>> index 264b3dc714cc..4f9660eafc72 100644
>>> --- a/kernel/bpf/verifier.c
>>> +++ b/kernel/bpf/verifier.c
>>> @@ -13386,6 +13386,9 @@ static int opt_subreg_zext_lo32_rnd_hi32(struct bpf_verifier_env *env,
>>>  		if (!bpf_jit_needs_zext() && !is_cmpxchg_insn(&insn))
>>>  			continue;
>>> +		if (insn.code == (BPF_JMP | BPF_CALL))
>>> +			load_reg = BPF_REG_0;

Want to double check. Do we actually have a problem here?
For example, on x64, we probably won't have this issue.

>>>     ...
>>>     extern __u32 tcp_reno_undo_cwnd(struct sock *sk) __ksym;
>>>
>>>     __u32 BPF_STRUCT_OPS(bpf_cubic_undo_cwnd, struct sock *sk)
>>>     {
>>>             return tcp_reno_undo_cwnd(sk);
>>>     }

The native code will return a 32-bit subreg to bpf program,
and bpf didn't do anything and return r0 to the kernel func.
In the kernel func, the kernel will take 32-bit subreg by
x86_64 convention. This applies to some other return types
like u8/s8/u16/s16/u32/s32.

Which architecture you actually see the issue?

>>> +
>>>  		if (WARN_ON(load_reg == -1)) {
>>>  			verbose(env, "verifier bug. zext_dst is set, but no reg is defined\n");
>>>  			return -EFAULT;
[...]
Yonghong Song <yhs@meta.com> writes:

> On 12/6/22 9:47 AM, Yonghong Song wrote:
[...]
>>>> +		if (insn.code == (BPF_JMP | BPF_CALL))
>>>> +			load_reg = BPF_REG_0;
>
> Want to double check. Do we actually have a problem here?
> For example, on x64, we probably won't have this issue.

The "problem" is that I hit this:
		if (WARN_ON(load_reg == -1)) {
			verbose(env, "verifier bug. zext_dst is set, but no reg is defined\n");
			return -EFAULT;
		}

This path is only taken for archs which have bpf_jit_needs_zext() ==
true. In my case it's riscv64, but it should hit i386, sparc, s390, ppc,
mips, and arm.

My reading of this thread has been that "marking the call as
zext_dst=true is incorrect", i.e. that LLVM will insert the correct
zext instructions.

So, one way of not hitting this path is what Ilya suggests -- in
check_kfunc_call():

  if (btf_type_is_scalar(t)) {
  	mark_reg_unknown(env, regs, BPF_REG_0);
  	mark_btf_func_reg_size(env, BPF_REG_0, t->size);
  }

change t->size to sizeof(u64). Then the call won't be marked.

> >>>     ...
> >>>     extern __u32 tcp_reno_undo_cwnd(struct sock *sk) __ksym;
> >>>
> >>>     __u32 BPF_STRUCT_OPS(bpf_cubic_undo_cwnd, struct sock *sk)
> >>>     {
> >>>             return tcp_reno_undo_cwnd(sk);
> >>>     }
>
> The native code will return a 32-bit subreg to bpf program,
> and bpf didn't do anything and return r0 to the kernel func.
> In the kernel func, the kernel will take 32-bit subreg by
> x86_64 convention. This applies to some other return types
> like u8/s8/u16/s16/u32/s32.
>
> Which architecture you actually see the issue?

This is riscv64, but the nature of the problem is more of an assertion
failure than codegen, AFAIK.

I hit it when I load progs/bpf_cubic.o from the selftest. Nightly clang
from apt.llvm.org: clang version 16.0.0
(++20221204034339+7a194cfb327a-1~exp1~20221204154444.167)


Björn
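The failure path described above can be illustrated with a small userspace model. It is a simplified sketch under stated assumptions: `struct insn_model`, `def_regno_model` and `zext_target_model` are hypothetical stand-ins for the verifier's data structures and for insn_def_regno() / opt_subreg_zext_lo32_rnd_hi32(), and only the opcode constants are taken from the BPF instruction encoding.

```c
#include <stdint.h>

/* Opcode constants from the BPF instruction encoding. */
#define BPF_ALU   0x04
#define BPF_JMP   0x05
#define BPF_X     0x08
#define BPF_CALL  0x80
#define BPF_MOV   0xb0
#define BPF_CLASS(code) ((code) & 0x07)
#define BPF_REG_0 0

struct insn_model {
	uint8_t code;
	uint8_t dst_reg;
};

/* Simplified model of insn_def_regno(): jump-class instructions,
 * which include calls, report "no register defined". */
static int def_regno_model(const struct insn_model *insn)
{
	if (BPF_CLASS(insn->code) == BPF_JMP)
		return -1;
	return insn->dst_reg;
}

/* Register the zext pass would extend for an instruction that has
 * zext_dst set. A result of -1 corresponds to the "verifier bug"
 * path; with_patch mimics the three lines added by the proposed diff. */
static int zext_target_model(const struct insn_model *insn, int with_patch)
{
	int load_reg = def_regno_model(insn);

	if (with_patch && insn->code == (BPF_JMP | BPF_CALL))
		load_reg = BPF_REG_0;
	return load_reg;
}
```

In this model a call instruction (opcode 0x85, as in the objdump output earlier in the thread) yields -1 without the patch and R0 with it, while an ordinary ALU move keeps resolving to its destination register.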
On 12/6/22 10:38 AM, Björn Töpel wrote:
> Yonghong Song <yhs@meta.com> writes:
[...]
>> Want to double check. Do we actually have a problem here?
>> For example, on x64, we probably won't have this issue.
>
> The "problem" is that I hit this:
> 		if (WARN_ON(load_reg == -1)) {
> 			verbose(env, "verifier bug. zext_dst is set, but no reg is defined\n");
> 			return -EFAULT;
> 		}
>
> This path is only taken for archs which have bpf_jit_needs_zext() ==
> true. In my case it's riscv64, but it should hit i386, sparc, s390, ppc,
> mips, and arm.
>
> My reading of this thread has been that "marking the call as
> zext_dst=true is incorrect", i.e. that LLVM will insert the correct
> zext instructions.

Your interpretation is correct. Yes, for func return values, the
llvm will insert correct zext/sext instructions if the return
value is used. Otherwise, if the return value simply passes
through, the caller call site should handle that properly.

So, yes changing t->size to sizeof(u64) in below code in
check_kfunc_call() should work. But the fix sounds like a hack
and we might have some side effect during verification, now
or future.

Maybe we could check BPF_PSEUDO_KFUNC_CALL in appropriate place to
prevent zext.

> So, one way of not hitting this path is what Ilya suggests -- in
> check_kfunc_call():
>
>   if (btf_type_is_scalar(t)) {
>   	mark_reg_unknown(env, regs, BPF_REG_0);
>   	mark_btf_func_reg_size(env, BPF_REG_0, t->size);
>   }
>
> change t->size to sizeof(u64). Then the call won't be marked.
[...]
> This is riscv64, but the nature of the problem is more of an assertion
> failure than codegen, AFAIK.
>
> I hit it when I load progs/bpf_cubic.o from the selftest. Nightly clang
> from apt.llvm.org: clang version 16.0.0
> (++20221204034339+7a194cfb327a-1~exp1~20221204154444.167)
>
>
> Björn
Yonghong Song <yhs@meta.com> writes:

>>>>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
>>>>>> index 264b3dc714cc..4f9660eafc72 100644
>>>>>> --- a/kernel/bpf/verifier.c
>>>>>> +++ b/kernel/bpf/verifier.c
>>>>>> @@ -13386,6 +13386,9 @@ static int opt_subreg_zext_lo32_rnd_hi32(struct bpf_verifier_env *env,
>>>>>>  		if (!bpf_jit_needs_zext() && !is_cmpxchg_insn(&insn))
>>>>>>  			continue;
>>>>>> +		if (insn.code == (BPF_JMP | BPF_CALL))
>>>>>> +			load_reg = BPF_REG_0;
>>>
>>> Want to double check. Do we actually have a problem here?
>>> For example, on x64, we probably won't have this issue.
>>
>> The "problem" is that I hit this:
>> 		if (WARN_ON(load_reg == -1)) {
>> 			verbose(env, "verifier bug. zext_dst is set, but no reg is defined\n");
>> 			return -EFAULT;
>> 		}
>>
>> This path is only taken for archs which have bpf_jit_needs_zext() ==
>> true. In my case it's riscv64, but it should hit i386, sparc, s390, ppc,
>> mips, and arm.
>>
>> My reading of this thread has been that "marking the call as
>> zext_dst=true is incorrect", i.e. that LLVM will insert the correct
>> zext instructions.
>
> Your interpretation is correct. Yes, for func return values, the
> llvm will insert correct zext/sext instructions if the return
> value is used. Otherwise, if the return value simply passes
> through, the caller call site should handle that properly.
>
> So, yes changing t->size to sizeof(u64) in below code in
> check_kfunc_call() should work. But the fix sounds like a hack
> and we might have some side effect during verification, now
> or future.
>
> Maybe we could check BPF_PSEUDO_KFUNC_CALL in appropriate place to
> prevent zext.

Thanks for all the input! I'll digest it, and get back with a v2.
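As a rough illustration of the BPF_PSEUDO_KFUNC_CALL idea quoted above, the sketch below shows how a kfunc call can be recognised from the instruction encoding and excluded from zext patching. The thread does not settle where such a check would live, so everything here is hypothetical: `struct call_insn_model`, `is_kfunc_call_model` and `wants_zext_model` are illustrative names, and only the constant values come from the BPF UAPI.

```c
#include <stdbool.h>
#include <stdint.h>

/* Constants from the BPF instruction encoding. */
#define BPF_JMP               0x05
#define BPF_CALL              0x80
#define BPF_PSEUDO_KFUNC_CALL 2    /* src_reg value used for kfunc calls */

struct call_insn_model {
	uint8_t code;
	uint8_t src_reg;
};

/* A kfunc call is a call instruction whose src_reg field carries the
 * BPF_PSEUDO_KFUNC_CALL marker. */
static bool is_kfunc_call_model(const struct call_insn_model *insn)
{
	return insn->code == (BPF_JMP | BPF_CALL) &&
	       insn->src_reg == BPF_PSEUDO_KFUNC_CALL;
}

/* Hypothetical policy: honour zext_dst for everything except kfunc
 * calls, leaving their return values to the compiler-inserted
 * extension discussed in the thread. */
static bool wants_zext_model(const struct call_insn_model *insn, bool zext_dst)
{
	return zext_dst && !is_kfunc_call_model(insn);
}
```

Whether a v2 takes this shape, the sizeof(u64) route, or something else is left open by the thread; the sketch only makes the suggested condition concrete.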
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 264b3dc714cc..4f9660eafc72 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13386,6 +13386,9 @@ static int opt_subreg_zext_lo32_rnd_hi32(struct bpf_verifier_env *env,
 		if (!bpf_jit_needs_zext() && !is_cmpxchg_insn(&insn))
 			continue;
 
+		if (insn.code == (BPF_JMP | BPF_CALL))
+			load_reg = BPF_REG_0;
+
 		if (WARN_ON(load_reg == -1)) {
 			verbose(env, "verifier bug. zext_dst is set, but no reg is defined\n");
 			return -EFAULT;