Message ID | 20230912233214.1518551-11-memxor@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | ec5290a178b787b2f8b21581fdadc919bd004e12 |
Delegated to: | BPF |
Series | Exceptions - 1/2 |
On Wed, Sep 13, 2023 at 1:32 AM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>
> The KASAN stack instrumentation when CONFIG_KASAN_STACK is true poisons
> the stack of a function when it is entered and unpoisons it when
> leaving. However, in the case of bpf_throw, we will never return as we
> switch our stack frame to the BPF exception callback. Later, this
> discrepancy will lead to confusing KASAN splats when kernel resumes
> execution on return from the BPF program.
>
> Fix this by unpoisoning everything below the stack pointer of the BPF
> program, which should cover the range that would not be unpoisoned. An
> example splat is below:
>
> BUG: KASAN: stack-out-of-bounds in stack_trace_consume_entry+0x14e/0x170
> Write of size 8 at addr ffffc900013af958 by task test_progs/227
>
> CPU: 0 PID: 227 Comm: test_progs Not tainted 6.5.0-rc2-g43f1c6c9052a-dirty #26
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-2.fc39 04/01/2014
> Call Trace:
>  <TASK>
>  dump_stack_lvl+0x4a/0x80
>  print_report+0xcf/0x670
>  ? arch_stack_walk+0x79/0x100
>  kasan_report+0xda/0x110
>  ? stack_trace_consume_entry+0x14e/0x170
>  ? stack_trace_consume_entry+0x14e/0x170
>  ? __pfx_stack_trace_consume_entry+0x10/0x10
>  stack_trace_consume_entry+0x14e/0x170
>  ? __sys_bpf+0xf2e/0x41b0
>  arch_stack_walk+0x8b/0x100
>  ? __sys_bpf+0xf2e/0x41b0
>  ? bpf_prog_test_run_skb+0x341/0x1c70
>  ? bpf_prog_test_run_skb+0x341/0x1c70
>  stack_trace_save+0x9b/0xd0
>  ? __pfx_stack_trace_save+0x10/0x10
>  ? __kasan_slab_free+0x109/0x180
>  ? bpf_prog_test_run_skb+0x341/0x1c70
>  ? __sys_bpf+0xf2e/0x41b0
>  ? __x64_sys_bpf+0x78/0xc0
>  ? do_syscall_64+0x3c/0x90
>  ? entry_SYSCALL_64_after_hwframe+0x6e/0xd8
>  kasan_save_stack+0x33/0x60
>  ? kasan_save_stack+0x33/0x60
>  ? kasan_set_track+0x25/0x30
>  ? kasan_save_free_info+0x2b/0x50
>  ? __kasan_slab_free+0x109/0x180
>  ? kmem_cache_free+0x191/0x460
>  ? bpf_prog_test_run_skb+0x341/0x1c70
>  kasan_set_track+0x25/0x30
>  kasan_save_free_info+0x2b/0x50
>  __kasan_slab_free+0x109/0x180
>  kmem_cache_free+0x191/0x460
>  bpf_prog_test_run_skb+0x341/0x1c70
>  ? __pfx_bpf_prog_test_run_skb+0x10/0x10
>  ? __fget_light+0x51/0x220
>  __sys_bpf+0xf2e/0x41b0
>  ? __might_fault+0xa2/0x170
>  ? __pfx___sys_bpf+0x10/0x10
>  ? lock_release+0x1de/0x620
>  ? __might_fault+0xcd/0x170
>  ? __pfx_lock_release+0x10/0x10
>  ? __pfx_blkcg_maybe_throttle_current+0x10/0x10
>  __x64_sys_bpf+0x78/0xc0
>  ? syscall_enter_from_user_mode+0x20/0x50
>  do_syscall_64+0x3c/0x90
>  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> RIP: 0033:0x7f0fbb38880d
> Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d
> 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d f3 45 12 00 f7 d8 64
> 89 01 48
> RSP: 002b:00007ffe13907de8 EFLAGS: 00000206 ORIG_RAX: 0000000000000141
> RAX: ffffffffffffffda RBX: 00007ffe13908708 RCX: 00007f0fbb38880d
> RDX: 0000000000000050 RSI: 00007ffe13907e20 RDI: 000000000000000a
> RBP: 00007ffe13907e00 R08: 0000000000000000 R09: 00007ffe13907e20
> R10: 0000000000000064 R11: 0000000000000206 R12: 0000000000000003
> R13: 0000000000000000 R14: 00007f0fbb532000 R15: 0000000000cfbd90
>  </TASK>
>
> The buggy address belongs to stack of task test_progs/227
> KASAN internal error: frame info validation failed; invalid marker: 0
>
> The buggy address belongs to the virtual mapping at
>  [ffffc900013a8000, ffffc900013b1000) created by:
>  kernel_clone+0xcd/0x600
>
> The buggy address belongs to the physical page:
> page:00000000b70f4332 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11418f
> flags: 0x2fffe0000000000(node=0|zone=2|lastcpupid=0x7fff)
> page_type: 0xffffffff()
> raw: 02fffe0000000000 0000000000000000 dead000000000122 0000000000000000
> raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
>  ffffc900013af800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  ffffc900013af880: 00 00 00 f1 f1 f1 f1 00 00 00 f3 f3 f3 f3 f3 00
> >ffffc900013af900: 00 00 00 00 00 00 00 00 00 00 00 f1 00 00 00 00
>                                                     ^
>  ffffc900013af980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  ffffc900013afa00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ==================================================================
> Disabling lock debugging due to kernel taint
>
> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
> Cc: Alexander Potapenko <glider@google.com>
> Cc: Andrey Konovalov <andreyknvl@gmail.com>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>  kernel/bpf/helpers.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 78e8f4de6750..2c8e1ee97b71 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -22,6 +22,7 @@
>  #include <linux/security.h>
>  #include <linux/btf_ids.h>
>  #include <linux/bpf_mem_alloc.h>
> +#include <linux/kasan.h>
>
>  #include "../../lib/kstrtox.h"
>
> @@ -2483,6 +2484,11 @@ __bpf_kfunc void bpf_throw(u64 cookie)
>  	WARN_ON_ONCE(!ctx.aux->exception_boundary);
>  	WARN_ON_ONCE(!ctx.bp);
>  	WARN_ON_ONCE(!ctx.cnt);
> +	/* Prevent KASAN false positives for CONFIG_KASAN_STACK by unpoisoning
> +	 * deeper stack depths than ctx.sp as we do not return from bpf_throw,
> +	 * which skips compiler generated instrumentation to do the same.
> +	 */
> +	kasan_unpoison_task_stack_below((void *)ctx.sp);
>  	ctx.aux->bpf_exception_cb(cookie, ctx.sp, ctx.bp);
>  }
>
> --
> 2.41.0
>

Acked-by: Andrey Konovalov <andreyknvl@gmail.com>
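
For readers unfamiliar with the mechanism the commit message describes, the following is a toy userspace model of it, not kernel code. It assumes one shadow byte per 8 bytes of stack and, as a simplification, poisons a whole frame rather than only the redzones around local variables. All names here (`frame_enter`, `frame_leave`, `unpoison_below`, `is_poisoned`) are hypothetical; only the idea of clearing shadow state for everything below a given stack pointer corresponds to what kasan_unpoison_task_stack_below() is used for in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 256 /* toy stack size, offsets 0..255 */
#define GRANULE     8   /* one shadow byte covers 8 stack bytes */
#define POISON      0xF1 /* any nonzero shadow value means "do not touch" */

/* Toy shadow memory; the stack grows toward lower offsets. */
static uint8_t shadow[STACK_BYTES / GRANULE];

/* Set the shadow bytes covering the stack range [lo, hi). */
static void set_shadow(size_t lo, size_t hi, uint8_t val)
{
	assert(lo <= hi && hi <= STACK_BYTES);
	memset(&shadow[lo / GRANULE], val, (hi - lo) / GRANULE);
}

/* Function entry: compiler-generated code poisons parts of the frame. */
static void frame_enter(size_t lo, size_t hi)
{
	set_shadow(lo, hi, POISON);
}

/* Normal return: the same instrumentation clears the poison again. */
static void frame_leave(size_t lo, size_t hi)
{
	set_shadow(lo, hi, 0);
}

/* Non-returning exit: clear everything from the stack base up to sp. */
static void unpoison_below(size_t sp)
{
	set_shadow(0, sp, 0);
}

static int is_poisoned(size_t off)
{
	return shadow[off / GRANULE] != 0;
}
```

In this model, a frame entered with `frame_enter(64, 96)` and abandoned without `frame_leave()` leaves offset 72 poisoned, so a later, legitimate write there would be reported. Calling `unpoison_below(128)` with the stack pointer of the frame that execution resumes in clears that stale state.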
Hi Kumar,

(+ netdev in Cc as this patch is now in net-next tree as well ; same for
mptcp-next)

On 13/09/2023 01:32, Kumar Kartikeya Dwivedi wrote:
> The KASAN stack instrumentation when CONFIG_KASAN_STACK is true poisons
> the stack of a function when it is entered and unpoisons it when
> leaving. However, in the case of bpf_throw, we will never return as we
> switch our stack frame to the BPF exception callback. Later, this
> discrepancy will lead to confusing KASAN splats when kernel resumes
> execution on return from the BPF program.
>
> Fix this by unpoisoning everything below the stack pointer of the BPF
> program, which should cover the range that would not be unpoisoned. An
> example splat is below:

Thank you for your patch!

(...)

> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 78e8f4de6750..2c8e1ee97b71 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -22,6 +22,7 @@
>  #include <linux/security.h>
>  #include <linux/btf_ids.h>
>  #include <linux/bpf_mem_alloc.h>
> +#include <linux/kasan.h>
>
>  #include "../../lib/kstrtox.h"
>
> @@ -2483,6 +2484,11 @@ __bpf_kfunc void bpf_throw(u64 cookie)
>  	WARN_ON_ONCE(!ctx.aux->exception_boundary);
>  	WARN_ON_ONCE(!ctx.bp);
>  	WARN_ON_ONCE(!ctx.cnt);
> +	/* Prevent KASAN false positives for CONFIG_KASAN_STACK by unpoisoning
> +	 * deeper stack depths than ctx.sp as we do not return from bpf_throw,
> +	 * which skips compiler generated instrumentation to do the same.
> +	 */
> +	kasan_unpoison_task_stack_below((void *)ctx.sp);

Our CI validating MPTCP tree has just reported the following error when
building the kernel for a 32-bit architecture:

  kernel/bpf/helpers.c: In function 'bpf_throw':
  kernel/bpf/helpers.c:2491:41: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
   2491 |         kasan_unpoison_task_stack_below((void *)ctx.sp);
        |                                         ^
  cc1: all warnings being treated as errors

Source:
https://github.com/multipath-tcp/mptcp_net-next/actions/runs/6221288400/job/16882945173

It looks like this issue has been introduced by your patch. Are you
already looking at a fix?

>  	ctx.aux->bpf_exception_cb(cookie, ctx.sp, ctx.bp);
>  }
>

Cheers,
Matt
On Mon, 18 Sept 2023 at 15:20, Matthieu Baerts <matthieu.baerts@tessares.net> wrote:
>
> Hi Kumar,
>
> (+ netdev in Cc as this patch is now in net-next tree as well ; same for
> mptcp-next)
>
>
> On 13/09/2023 01:32, Kumar Kartikeya Dwivedi wrote:
> > The KASAN stack instrumentation when CONFIG_KASAN_STACK is true poisons
> > the stack of a function when it is entered and unpoisons it when
> > leaving. However, in the case of bpf_throw, we will never return as we
> > switch our stack frame to the BPF exception callback. Later, this
> > discrepancy will lead to confusing KASAN splats when kernel resumes
> > execution on return from the BPF program.
> >
> > Fix this by unpoisoning everything below the stack pointer of the BPF
> > program, which should cover the range that would not be unpoisoned. An
> > example splat is below:
>
> Thank you for your patch!
>
> (...)
>
> > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > index 78e8f4de6750..2c8e1ee97b71 100644
> > --- a/kernel/bpf/helpers.c
> > +++ b/kernel/bpf/helpers.c
> > @@ -22,6 +22,7 @@
> >  #include <linux/security.h>
> >  #include <linux/btf_ids.h>
> >  #include <linux/bpf_mem_alloc.h>
> > +#include <linux/kasan.h>
> >
> >  #include "../../lib/kstrtox.h"
> >
> > @@ -2483,6 +2484,11 @@ __bpf_kfunc void bpf_throw(u64 cookie)
> >  	WARN_ON_ONCE(!ctx.aux->exception_boundary);
> >  	WARN_ON_ONCE(!ctx.bp);
> >  	WARN_ON_ONCE(!ctx.cnt);
> > +	/* Prevent KASAN false positives for CONFIG_KASAN_STACK by unpoisoning
> > +	 * deeper stack depths than ctx.sp as we do not return from bpf_throw,
> > +	 * which skips compiler generated instrumentation to do the same.
> > +	 */
> > +	kasan_unpoison_task_stack_below((void *)ctx.sp);
>
> Our CI validating MPTCP tree has just reported the following error when
> building the kernel for a 32-bit architecture:
>
>   kernel/bpf/helpers.c: In function 'bpf_throw':
>   kernel/bpf/helpers.c:2491:41: error: cast to pointer from integer of
> different size [-Werror=int-to-pointer-cast]
>    2491 |         kasan_unpoison_task_stack_below((void *)ctx.sp);
>         |                                         ^
>   cc1: all warnings being treated as errors
>
> Source:
> https://github.com/multipath-tcp/mptcp_net-next/actions/runs/6221288400/job/16882945173
>
>
> It looks like this issue has been introduced by your patch. Are you
> already looking at a fix?
>

Yes, my patch is responsible. So pointers here are 32-bits, while
ctx.sp is 64-bit, hence it is complaining. I think long is supposed to
match pointer width on all targets Linux supports, so doing this
should fix it.

(void*)(long)ctx.sp

I will send a fix for this soon. Thanks

> >  	ctx.aux->bpf_exception_cb(cookie, ctx.sp, ctx.bp);
> >  }
> >
>
> Cheers,
> Matt
> --
> Tessares | Belgium | Hybrid Access Solutions
> www.tessares.net
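
The cast discussed above can be illustrated outside the kernel with a minimal sketch. It is not the follow-up fix itself. The helper names `u64_to_ptr` and `ptr_to_u64` and the local `u64` typedef are assumptions made for illustration, and `uintptr_t` stands in for the `long` proposed in the thread, since `long` is only guaranteed to be pointer-sized on the ABIs Linux uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's u64 type. */
typedef uint64_t u64;

/*
 * Convert a 64-bit integer holding an address into a pointer.
 *
 * Casting u64 directly to void * triggers -Wint-to-pointer-cast on
 * targets where pointers are 32 bits wide. Narrowing through a
 * pointer-sized integer first makes the size change explicit, which
 * is what the proposed (void *)(long)ctx.sp does in the kernel.
 */
static void *u64_to_ptr(u64 addr)
{
	/* The address must fit in a pointer, or the cast would truncate it. */
	assert(addr <= UINTPTR_MAX);
	return (void *)(uintptr_t)addr;
}

/* Opposite direction: widen a pointer to u64 without sign extension. */
static u64 ptr_to_u64(const void *ptr)
{
	return (u64)(uintptr_t)ptr;
}
```

A round trip such as `u64_to_ptr(ptr_to_u64(&x))` yields `&x` again on both 32-bit and 64-bit builds, and the intermediate integer cast is what keeps the compiler from flagging a size mismatch.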
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 78e8f4de6750..2c8e1ee97b71 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -22,6 +22,7 @@
 #include <linux/security.h>
 #include <linux/btf_ids.h>
 #include <linux/bpf_mem_alloc.h>
+#include <linux/kasan.h>
 
 #include "../../lib/kstrtox.h"
 
@@ -2483,6 +2484,11 @@ __bpf_kfunc void bpf_throw(u64 cookie)
 	WARN_ON_ONCE(!ctx.aux->exception_boundary);
 	WARN_ON_ONCE(!ctx.bp);
 	WARN_ON_ONCE(!ctx.cnt);
+	/* Prevent KASAN false positives for CONFIG_KASAN_STACK by unpoisoning
+	 * deeper stack depths than ctx.sp as we do not return from bpf_throw,
+	 * which skips compiler generated instrumentation to do the same.
+	 */
+	kasan_unpoison_task_stack_below((void *)ctx.sp);
 	ctx.aux->bpf_exception_cb(cookie, ctx.sp, ctx.bp);
 }