Message ID | 20230713023232.1411523-8-memxor@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | BPF |
Headers | show |
Series | Exceptions - 1/2 | expand |
On Thu, Jul 13, 2023 at 08:02:29AM +0530, Kumar Kartikeya Dwivedi wrote: > Now that we allow exception throwing using bpf_throw kfunc, it can > appear as the final instruction in a prog. When this happens, and we > begin to unwind the stack using arch_bpf_stack_walk, the instruction > pointer (IP) may appear to lie outside the JITed instructions. This > happens because the return address is the instruction following the > call, but the bpf_throw never returns to the program, so the JIT > considers instruction ending at the bpf_throw call as the final JITed > instruction and end of the jited_length for the program. > > This becomes a problem when we search the IP using is_bpf_text_address > and bpf_prog_ksym_find, both of which use bpf_ksym_find under the hood, > and it rightfully considers addr == ksym.end to be outside the program's > boundaries. > > Insert a dummy 'int3' instruction which will never be hit to bump the > jited_length and allow us to handle programs with their final > isntruction being a call to bpf_throw. > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> > --- > arch/x86/net/bpf_jit_comp.c | 11 +++++++++++ > include/linux/bpf.h | 2 ++ > 2 files changed, 13 insertions(+) > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c > index 8d97c6a60f9a..052230cc7f50 100644 > --- a/arch/x86/net/bpf_jit_comp.c > +++ b/arch/x86/net/bpf_jit_comp.c > @@ -1579,6 +1579,17 @@ st: if (is_imm8(insn->off)) > } > if (emit_call(&prog, func, image + addrs[i - 1] + offs)) > return -EINVAL; > + /* Similar to BPF_EXIT_INSN, call for bpf_throw may be > + * the final instruction in the program. Insert an int3 > + * following the call instruction so that we can still > + * detect pc to be part of the bpf_prog in > + * bpf_ksym_find, otherwise when this is the last > + * instruction (as allowed by verifier, similar to exit > + * and jump instructions), pc will be == ksym.end, > + * leading to bpf_throw failing to unwind the stack. > + */ > + if (func == (u8 *)&bpf_throw) > + EMIT1(0xCC); /* int3 */ Probably worth explaining that this happens because bpf_throw is marked __attribute__((noreturn)) and compiler can emit it last without BPF_EXIT insn. Meaing the program might not have BPF_EXIT at all. I wonder though whether this self-inflicted pain is worth it. May be it shouldn't be marked as noreturn. What do we gain by marking?
On Sat, 15 Jul 2023 at 04:09, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Thu, Jul 13, 2023 at 08:02:29AM +0530, Kumar Kartikeya Dwivedi wrote: > > Now that we allow exception throwing using bpf_throw kfunc, it can > > appear as the final instruction in a prog. When this happens, and we > > begin to unwind the stack using arch_bpf_stack_walk, the instruction > > pointer (IP) may appear to lie outside the JITed instructions. This > > happens because the return address is the instruction following the > > call, but the bpf_throw never returns to the program, so the JIT > > considers instruction ending at the bpf_throw call as the final JITed > > instruction and end of the jited_length for the program. > > > > This becomes a problem when we search the IP using is_bpf_text_address > > and bpf_prog_ksym_find, both of which use bpf_ksym_find under the hood, > > and it rightfully considers addr == ksym.end to be outside the program's > > boundaries. > > > > Insert a dummy 'int3' instruction which will never be hit to bump the > > jited_length and allow us to handle programs with their final > > isntruction being a call to bpf_throw. > > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> > > --- > > arch/x86/net/bpf_jit_comp.c | 11 +++++++++++ > > include/linux/bpf.h | 2 ++ > > 2 files changed, 13 insertions(+) > > > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c > > index 8d97c6a60f9a..052230cc7f50 100644 > > --- a/arch/x86/net/bpf_jit_comp.c > > +++ b/arch/x86/net/bpf_jit_comp.c > > @@ -1579,6 +1579,17 @@ st: if (is_imm8(insn->off)) > > } > > if (emit_call(&prog, func, image + addrs[i - 1] + offs)) > > return -EINVAL; > > + /* Similar to BPF_EXIT_INSN, call for bpf_throw may be > > + * the final instruction in the program. Insert an int3 > > + * following the call instruction so that we can still > > + * detect pc to be part of the bpf_prog in > > + * bpf_ksym_find, otherwise when this is the last > > + * instruction (as allowed by verifier, similar to exit > > + * and jump instructions), pc will be == ksym.end, > > + * leading to bpf_throw failing to unwind the stack. > > + */ > > + if (func == (u8 *)&bpf_throw) > > + EMIT1(0xCC); /* int3 */ > > Probably worth explaining that this happens because bpf_throw is marked > __attribute__((noreturn)) and compiler can emit it last without BPF_EXIT insn. > Meaing the program might not have BPF_EXIT at all. Yes, sorry about omitting that. I will add it to the commit message in v2. > > I wonder though whether this self-inflicted pain is worth it. > May be it shouldn't be marked as noreturn. > What do we gain by marking? It felt like the obvious thing to do to me. The cost on the kernel side is negligible (atleast in my opinion), we just have to allow it as final instruction in the program. If it's heavily used it allows the compiler to better optimize the code (marking anything after it unreachable, no need to save registers etc., although this may not be a persuasive point for you). Regardless of this noreturn attribute, I was thinking whether we should always emit an extra instruction so that any IP (say one past last instruction) we get for a BPF prog can always be seen as belonging to it. It probably is only a problem surfaced by this bpf_throw call at the end, but I was wondering whether doing it unconditionally makes sense.
On Mon, Jul 17, 2023 at 9:36 AM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote: > > On Sat, 15 Jul 2023 at 04:09, Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: > > > > On Thu, Jul 13, 2023 at 08:02:29AM +0530, Kumar Kartikeya Dwivedi wrote: > > > Now that we allow exception throwing using bpf_throw kfunc, it can > > > appear as the final instruction in a prog. When this happens, and we > > > begin to unwind the stack using arch_bpf_stack_walk, the instruction > > > pointer (IP) may appear to lie outside the JITed instructions. This > > > happens because the return address is the instruction following the > > > call, but the bpf_throw never returns to the program, so the JIT > > > considers instruction ending at the bpf_throw call as the final JITed > > > instruction and end of the jited_length for the program. > > > > > > This becomes a problem when we search the IP using is_bpf_text_address > > > and bpf_prog_ksym_find, both of which use bpf_ksym_find under the hood, > > > and it rightfully considers addr == ksym.end to be outside the program's > > > boundaries. > > > > > > Insert a dummy 'int3' instruction which will never be hit to bump the > > > jited_length and allow us to handle programs with their final > > > isntruction being a call to bpf_throw. > > > > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> > > > --- > > > arch/x86/net/bpf_jit_comp.c | 11 +++++++++++ > > > include/linux/bpf.h | 2 ++ > > > 2 files changed, 13 insertions(+) > > > > > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c > > > index 8d97c6a60f9a..052230cc7f50 100644 > > > --- a/arch/x86/net/bpf_jit_comp.c > > > +++ b/arch/x86/net/bpf_jit_comp.c > > > @@ -1579,6 +1579,17 @@ st: if (is_imm8(insn->off)) > > > } > > > if (emit_call(&prog, func, image + addrs[i - 1] + offs)) > > > return -EINVAL; > > > + /* Similar to BPF_EXIT_INSN, call for bpf_throw may be > > > + * the final instruction in the program. Insert an int3 > > > + * following the call instruction so that we can still > > > + * detect pc to be part of the bpf_prog in > > > + * bpf_ksym_find, otherwise when this is the last > > > + * instruction (as allowed by verifier, similar to exit > > > + * and jump instructions), pc will be == ksym.end, > > > + * leading to bpf_throw failing to unwind the stack. > > > + */ > > > + if (func == (u8 *)&bpf_throw) > > > + EMIT1(0xCC); /* int3 */ > > > > Probably worth explaining that this happens because bpf_throw is marked > > __attribute__((noreturn)) and compiler can emit it last without BPF_EXIT insn. > > Meaing the program might not have BPF_EXIT at all. > > Yes, sorry about omitting that. I will add it to the commit message in v2. > > > > > I wonder though whether this self-inflicted pain is worth it. > > May be it shouldn't be marked as noreturn. > > What do we gain by marking? > > It felt like the obvious thing to do to me. The cost on the kernel > side is negligible (atleast in my opinion), we just have to allow it > as final instruction in the program. If it's heavily used it allows > the compiler to better optimize the code (marking anything after it > unreachable, no need to save registers etc., although this may not be > a persuasive point for you). "no need to save registers"... "optimize"... that's the thing that worries me. I think it's better to drop noreturn attribute. bpf has implicit prolog/epilogue that only apply to bpf_exit insn. bpf_call insn that doesn't return is exploiting undefined logic in the compiler, since we never fully clarified our hidden prologue/epilogue rules. Saying it differently, bpf_tail_call is also noreturn, but if we mark it as such all kinds of things will break. We still need to add alloca(). It doesn't play well with the current BPF ISA. I think it's better to treat 'noreturn' as broken in the compiler, since its behavior may change. > Regardless of this noreturn attribute, I was thinking whether we > should always emit an extra instruction so that any IP (say one past > last instruction) we get for a BPF prog can always be seen as > belonging to it. It probably is only a problem surfaced by this > bpf_throw call at the end, but I was wondering whether doing it > unconditionally makes sense. I think it's a corner case of this 'noreturn' from bpf_call logic. The bpf prog is padded with 0xcc before and after already. What you're suggesting is to add one of 0xcc to the body of the prog. That doesn't sound right.
On Mon, 17 Jul 2023 at 23:15, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Mon, Jul 17, 2023 at 9:36 AM Kumar Kartikeya Dwivedi > <memxor@gmail.com> wrote: > > > > On Sat, 15 Jul 2023 at 04:09, Alexei Starovoitov > > <alexei.starovoitov@gmail.com> wrote: > > > > > > On Thu, Jul 13, 2023 at 08:02:29AM +0530, Kumar Kartikeya Dwivedi wrote: > > > > Now that we allow exception throwing using bpf_throw kfunc, it can > > > > appear as the final instruction in a prog. When this happens, and we > > > > begin to unwind the stack using arch_bpf_stack_walk, the instruction > > > > pointer (IP) may appear to lie outside the JITed instructions. This > > > > happens because the return address is the instruction following the > > > > call, but the bpf_throw never returns to the program, so the JIT > > > > considers instruction ending at the bpf_throw call as the final JITed > > > > instruction and end of the jited_length for the program. > > > > > > > > This becomes a problem when we search the IP using is_bpf_text_address > > > > and bpf_prog_ksym_find, both of which use bpf_ksym_find under the hood, > > > > and it rightfully considers addr == ksym.end to be outside the program's > > > > boundaries. > > > > > > > > Insert a dummy 'int3' instruction which will never be hit to bump the > > > > jited_length and allow us to handle programs with their final > > > > isntruction being a call to bpf_throw. > > > > > > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> > > > > --- > > > > arch/x86/net/bpf_jit_comp.c | 11 +++++++++++ > > > > include/linux/bpf.h | 2 ++ > > > > 2 files changed, 13 insertions(+) > > > > > > > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c > > > > index 8d97c6a60f9a..052230cc7f50 100644 > > > > --- a/arch/x86/net/bpf_jit_comp.c > > > > +++ b/arch/x86/net/bpf_jit_comp.c > > > > @@ -1579,6 +1579,17 @@ st: if (is_imm8(insn->off)) > > > > } > > > > if (emit_call(&prog, func, image + addrs[i - 1] + offs)) > > > > return -EINVAL; > > > > + /* Similar to BPF_EXIT_INSN, call for bpf_throw may be > > > > + * the final instruction in the program. Insert an int3 > > > > + * following the call instruction so that we can still > > > > + * detect pc to be part of the bpf_prog in > > > > + * bpf_ksym_find, otherwise when this is the last > > > > + * instruction (as allowed by verifier, similar to exit > > > > + * and jump instructions), pc will be == ksym.end, > > > > + * leading to bpf_throw failing to unwind the stack. > > > > + */ > > > > + if (func == (u8 *)&bpf_throw) > > > > + EMIT1(0xCC); /* int3 */ > > > > > > Probably worth explaining that this happens because bpf_throw is marked > > > __attribute__((noreturn)) and compiler can emit it last without BPF_EXIT insn. > > > Meaing the program might not have BPF_EXIT at all. > > > > Yes, sorry about omitting that. I will add it to the commit message in v2. > > > > > > > > I wonder though whether this self-inflicted pain is worth it. > > > May be it shouldn't be marked as noreturn. > > > What do we gain by marking? > > > > It felt like the obvious thing to do to me. The cost on the kernel > > side is negligible (atleast in my opinion), we just have to allow it > > as final instruction in the program. If it's heavily used it allows > > the compiler to better optimize the code (marking anything after it > > unreachable, no need to save registers etc., although this may not be > > a persuasive point for you). > > "no need to save registers"... "optimize"... that's the thing that worries me. > I think it's better to drop noreturn attribute. > bpf has implicit prolog/epilogue that only apply to bpf_exit insn. > bpf_call insn that doesn't return is exploiting undefined logic > in the compiler, since we never fully clarified our hidden prologue/epilogue > rules. Saying it differently, bpf_tail_call is also noreturn, > but if we mark it as such all kinds of things will break. > We still need to add alloca(). It doesn't play well with the current BPF ISA. > I think it's better to treat 'noreturn' as broken in the compiler, > since its behavior may change. > Ok, I think then let's drop this patch and the noreturn attribute on bpf_throw. > > Regardless of this noreturn attribute, I was thinking whether we > > should always emit an extra instruction so that any IP (say one past > > last instruction) we get for a BPF prog can always be seen as > > belonging to it. It probably is only a problem surfaced by this > > bpf_throw call at the end, but I was wondering whether doing it > > unconditionally makes sense. > > I think it's a corner case of this 'noreturn' from bpf_call logic. > The bpf prog is padded with 0xcc before and after already. > What you're suggesting is to add one of 0xcc to the body of the prog. > That doesn't sound right. Actually this patch was added because I caught the case where the reported ip during unwinding was lying outside jited_length (== ksym.end). So including an extra instruction would prevent that. But it's true it's a side effect of the noreturn attribute, it wouldn't occur otherwise. Let's drop this patch then.
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 8d97c6a60f9a..052230cc7f50 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -1579,6 +1579,17 @@ st: if (is_imm8(insn->off)) } if (emit_call(&prog, func, image + addrs[i - 1] + offs)) return -EINVAL; + /* Similar to BPF_EXIT_INSN, call for bpf_throw may be + * the final instruction in the program. Insert an int3 + * following the call instruction so that we can still + * detect pc to be part of the bpf_prog in + * bpf_ksym_find, otherwise when this is the last + * instruction (as allowed by verifier, similar to exit + * and jump instructions), pc will be == ksym.end, + * leading to bpf_throw failing to unwind the stack. + */ + if (func == (u8 *)&bpf_throw) + EMIT1(0xCC); /* int3 */ break; } diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 61cdb291311f..1652d184ee7f 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -3111,4 +3111,6 @@ static inline gfp_t bpf_memcg_flags(gfp_t flags) return flags; } +extern void bpf_throw(u64); + #endif /* _LINUX_BPF_H */
Now that we allow exception throwing using bpf_throw kfunc, it can appear as the final instruction in a prog. When this happens, and we begin to unwind the stack using arch_bpf_stack_walk, the instruction pointer (IP) may appear to lie outside the JITed instructions. This happens because the return address is the instruction following the call, but the bpf_throw never returns to the program, so the JIT considers instruction ending at the bpf_throw call as the final JITed instruction and end of the jited_length for the program. This becomes a problem when we search the IP using is_bpf_text_address and bpf_prog_ksym_find, both of which use bpf_ksym_find under the hood, and it rightfully considers addr == ksym.end to be outside the program's boundaries. Insert a dummy 'int3' instruction which will never be hit to bump the jited_length and allow us to handle programs with their final isntruction being a call to bpf_throw. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> --- arch/x86/net/bpf_jit_comp.c | 11 +++++++++++ include/linux/bpf.h | 2 ++ 2 files changed, 13 insertions(+)