Message ID | 20240505160628.2323363-17-rppt@kernel.org (mailing list archive) |
---|---|
State | Awaiting Upstream |
Series | mm: jit/text allocator |
Hi,

On 2024-05-05 18:06, Mike Rapoport wrote:
> From: "Mike Rapoport (IBM)" <rppt@kernel.org>
>
> BPF just-in-time compiler depended on CONFIG_MODULES because it used
> module_alloc() to allocate memory for the generated code.
>
> Since code allocations are now implemented with execmem, drop dependency of
> CONFIG_BPF_JIT on CONFIG_MODULES and make it select CONFIG_EXECMEM.
>
> Suggested-by: Björn Töpel <bjorn@kernel.org>
> Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
> ---
>  kernel/bpf/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/Kconfig b/kernel/bpf/Kconfig
> index bc25f5098a25..f999e4e0b344 100644
> --- a/kernel/bpf/Kconfig
> +++ b/kernel/bpf/Kconfig
> @@ -43,7 +43,7 @@ config BPF_JIT
>      bool "Enable BPF Just In Time compiler"
>      depends on BPF
>      depends on HAVE_CBPF_JIT || HAVE_EBPF_JIT
> -    depends on MODULES
> +    select EXECMEM
>      help
>        BPF programs are normally handled by a BPF interpreter. This option
>        allows the kernel to generate native code when a program is loaded

This does not seem to work entirely. If I build with BPF_JIT without module
support for my Raspberry Pi 3 B I get warnings in my kernel log (easiest way
to trigger it seems to be trying to ssh into it, which fails).

Kind regards,
Klara Modin

ldrop login: [ 43.741638] Internal error: BRK handler: 00000000f2000100 [#1] SMP
[ 43.749269] CPU: 3 PID: 2083 Comm: sshd Not tainted 6.9.0-01786-g2c9e5d4a0082 #25
[ 43.758216] Hardware name: Raspberry Pi 3 Model B (DT)
[ 43.764769] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 43.773199] pc : 0xffff8000814dd8b8
[ 43.778084] lr : __seccomp_filter (include/linux/bpf.h:1234 include/linux/filter.h:657 include/linux/filter.h:664 include/linux/filter.h:681 kernel/seccomp.c:426 kernel/seccomp.c:1222)
[ 43.783784] sp : ffff8000855a3d40
[ 43.788471] x29: ffff8000855a3d90 x28: 0000000000000000 x27: 0000000000000001
[ 43.797082] x26: 00000000000000de x25: 0000000000000000 x24: 000000007fff0000
[ 43.805652] x23: 0000000080000000 x22: ffff8000855a3d48 x21: ffff000005446480
[ 43.814189] x20: ffff0000046ad300 x19: ffff80008147d000 x18: 0000000000000000
[ 43.822694] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 43.831160] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[ 43.839577] x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000
[ 43.847966] x8 : 0000000000000000 x7 : 0000000000001000 x6 : 0000000000000022
[ 43.856311] x5 : 0000000000000003 x4 : 0000000000000000 x3 : 0000000000000001
[ 43.864636] x2 : ffff8000814dd8b8 x1 : ffff80008147d048 x0 : ffff8000855a3d48
[ 43.872958] Call trace:
[ 43.876450] 0xffff8000814dd8b8
[ 43.880610] __secure_computing (kernel/seccomp.c:1363)
[ 43.885622] syscall_trace_enter (arch/arm64/kernel/ptrace.c:2242 (discriminator 1))
[ 43.890826] el0_svc_common.constprop.0 (arch/arm64/kernel/syscall.c:128)
[ 43.896593] do_el0_svc (arch/arm64/kernel/syscall.c:153)
[ 43.900909] el0_svc (arch/arm64/include/asm/irqflags.h:56 arch/arm64/include/asm/irqflags.h:77 arch/arm64/kernel/entry-common.c:165 arch/arm64/kernel/entry-common.c:178 arch/arm64/kernel/entry-common.c:713)
[ 43.904922] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:731)
[ 43.910232] el0t_64_sync (arch/arm64/kernel/entry.S:598)
[ 43.914795] Code: d4202000 d4202000 d4202000 d4202000 (d4202000)
All code
========
   0:* 00 20    add %ah,(%rax)    <-- trapping instruction
   2:  20 d4    and %dl,%ah
   4:  00 20    add %ah,(%rax)
   6:  20 d4    and %dl,%ah
   8:  00 20    add %ah,(%rax)
   a:  20 d4    and %dl,%ah
   c:  00 20    add %ah,(%rax)
   e:  20 d4    and %dl,%ah
  10:  00 20    add %ah,(%rax)
  12:  20 d4    and %dl,%ah

Code starting with the faulting instruction
===========================================
   0:  00 20    add %ah,(%rax)
   2:  20 d4    and %dl,%ah
[ 43.921826] ---[ end trace 0000000000000000 ]---
[ 43.927335] note: sshd[2083] exited with irqs disabled
[ 43.933417] note: sshd[2083] exited with preempt_count 1
[ 43.934685] ------------[ cut here ]------------
[ 43.945156] WARNING: CPU: 3 PID: 0 at kernel/context_tracking.c:128 ct_kernel_exit.constprop.0 (kernel/context_tracking.c:128 (discriminator 1))
[ 43.956500] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G D 6.9.0-01786-g2c9e5d4a0082 #25
[ 43.967570] Hardware name: Raspberry Pi 3 Model B (DT)
[ 43.973716] pstate: 200003c5 (nzCv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 43.981774] pc : ct_kernel_exit.constprop.0 (kernel/context_tracking.c:128 (discriminator 1))
[ 43.987920] lr : ct_idle_enter (kernel/context_tracking.c:321)
[ 43.992926] sp : ffff80008144bdd0
[ 43.997312] x29: ffff80008144bdd0 x28: ffff000002061100 x27: 0000000000000000
[ 44.005623] x26: ffff80008154bde0 x25: ffff000001a590c0 x24: 0000000000000000
[ 44.013899] x23: 0000000000000000 x22: ffff000001a590c0 x21: ffff80008118ad28
[ 44.022186] x20: ffff80008118ac08 x19: ffff00003a1bd610 x18: ffff8000855a3878
[ 44.030483] x17: ffffffffffffffff x16: 0000000000000000 x15: 0000ffffbbbce000
[ 44.038794] x14: 04d1d6f476a588c8 x13: 00000000000003bb x12: 0000000000000001
[ 44.047107] x11: 0000000000000001 x10: 0000000000000a00 x9 : ffff80008144bd30
[ 44.055426] x8 : ffff000001a59b20 x7 : 0000000000000000 x6 : 000000003ad2e995
[ 44.063758] x5 : 4000000000000002 x4 : ffff7fffb91c3000 x3 : ffff80008144bdd0
[ 44.072105] x2 : 4000000000000000 x1 : ffff800080ffa610 x0 : ffff800080ffa610
[ 44.080461] Call trace:
[ 44.084011] ct_kernel_exit.constprop.0 (kernel/context_tracking.c:128 (discriminator 1))
[ 44.089869] ct_idle_enter (kernel/context_tracking.c:321)
[ 44.094563] default_idle_call (kernel/sched/idle.c:117)
[ 44.099622] do_idle (kernel/sched/idle.c:192 kernel/sched/idle.c:332)
[ 44.103959] cpu_startup_entry (kernel/sched/idle.c:429)
[ 44.108970] secondary_start_kernel (arch/arm64/include/asm/atomic_ll_sc.h:95 (discriminator 2) arch/arm64/include/asm/atomic.h:28 (discriminator 2) include/linux/atomic/atomic-arch-fallback.h:546 (discriminator 2) include/linux/atomic/atomic-arch-fallback.h:994 (discriminator 2) include/linux/atomic/atomic-instrumented.h:436 (discriminator 2) include/linux/sched/mm.h:36 (discriminator 2) arch/arm64/kernel/smp.c:214 (discriminator 2))
[ 44.114569] __secondary_switched (arch/arm64/kernel/head.S:418)
[ 44.119783] ---[ end trace 0000000000000000 ]---
[ 44.125645] ------------[ cut here ]------------
[ 44.131296] Trying to vfree() bad address (000000004a17c299)
[ 44.138024] WARNING: CPU: 1 PID: 193 at mm/vmalloc.c:3189 remove_vm_area (mm/vmalloc.c:3189 (discriminator 1))
[ 44.146675] CPU: 1 PID: 193 Comm: kworker/1:2 Tainted: G D W 6.9.0-01786-g2c9e5d4a0082 #25
[ 44.158229] Hardware name: Raspberry Pi 3 Model B (DT)
[ 44.164433] Workqueue: events bpf_prog_free_deferred
[ 44.170492] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 44.178601] pc : remove_vm_area (mm/vmalloc.c:3189 (discriminator 1))
[ 44.183705] lr : remove_vm_area (mm/vmalloc.c:3189 (discriminator 1))
[ 44.188772] sp : ffff800082a13c70
[ 44.193112] x29: ffff800082a13c70 x28: 0000000000000000 x27: 0000000000000000
[ 44.201384] x26: 0000000000000000 x25: ffff00003a44efa0 x24: 00000000d4202000
[ 44.209658] x23: ffff800081223dd0 x22: ffff00003a198a40 x21: ffff8000814dd880
[ 44.217924] x20: 00000000d4202000 x19: ffff8000814dd880 x18: 0000000000000006
[ 44.226206] x17: 0000000000000000 x16: 0000000000000020 x15: 0000000000000002
[ 44.234460] x14: ffff8000811a6370 x13: 0000000020000000 x12: 0000000000000000
[ 44.242710] x11: ffff8000811a6370 x10: 0000000000000144 x9 : ffff8000811fe370
[ 44.250959] x8 : 0000000000017fe8 x7 : 00000000fffff000 x6 : ffff8000811fe370
[ 44.259206] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 44.267457] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000002203240
[ 44.275703] Call trace:
[ 44.279158] remove_vm_area (mm/vmalloc.c:3189 (discriminator 1))
[ 44.283858] vfree (mm/vmalloc.c:3322)
[ 44.287835] execmem_free (mm/execmem.c:70)
[ 44.292347] bpf_jit_free_exec+0x10/0x1c
[ 44.297283] bpf_prog_pack_free (kernel/bpf/core.c:1006)
[ 44.302457] bpf_jit_binary_pack_free (kernel/bpf/core.c:1195)
[ 44.307951] bpf_jit_free (include/linux/filter.h:1083 arch/arm64/net/bpf_jit_comp.c:2474)
[ 44.312342] bpf_prog_free_deferred (kernel/bpf/core.c:2785)
[ 44.317785] process_one_work (kernel/workqueue.c:3273)
[ 44.322684] worker_thread (kernel/workqueue.c:3342 (discriminator 2) kernel/workqueue.c:3429 (discriminator 2))
[ 44.327292] kthread (kernel/kthread.c:388)
[ 44.331342] ret_from_fork (arch/arm64/kernel/entry.S:861)
[ 44.335758] ---[ end trace 0000000000000000 ]---
[ 44.341288] ------------[ cut here ]------------
[ 44.346777] Trying to vfree() nonexistent vm area (000000004a17c299)
[ 44.354077] WARNING: CPU: 1 PID: 193 at mm/vmalloc.c:3324 vfree (mm/vmalloc.c:3324 (discriminator 1))
[ 44.361988] CPU: 1 PID: 193 Comm: kworker/1:2 Tainted: G D W 6.9.0-01786-g2c9e5d4a0082 #25
[ 44.373301] Hardware name: Raspberry Pi 3 Model B (DT)
[ 44.379397] Workqueue: events bpf_prog_free_deferred
[ 44.385342] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 44.393343] pc : vfree (mm/vmalloc.c:3324 (discriminator 1))
[ 44.397723] lr : vfree (mm/vmalloc.c:3324 (discriminator 1))
[ 44.402088] sp : ffff800082a13c90
[ 44.406326] x29: ffff800082a13c90 x28: 0000000000000000 x27: 0000000000000000
[ 44.414509] x26: 0000000000000000 x25: ffff00003a44efa0 x24: 00000000d4202000
[ 44.422704] x23: ffff800081223dd0 x22: ffff00003a198a40 x21: 0000000000000000
[ 44.430908] x20: 00000000d4202000 x19: ffff8000814dd880 x18: 0000000000000006
[ 44.439122] x17: 0000000000000000 x16: 0000000000000020 x15: 0000000000000002
[ 44.447338] x14: ffff8000811a6370 x13: 0000000020000000 x12: 0000000000000000
[ 44.455553] x11: ffff8000811a6370 x10: 0000000000000166 x9 : ffff8000811fe370
[ 44.463771] x8 : 0000000000017fe8 x7 : 00000000fffff000 x6 : ffff8000811fe370
[ 44.471989] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 44.480208] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000002203240
[ 44.488420] Call trace:
[ 44.491847] vfree (mm/vmalloc.c:3324 (discriminator 1))
[ 44.495900] execmem_free (mm/execmem.c:70)
[ 44.500394] bpf_jit_free_exec+0x10/0x1c
[ 44.505329] bpf_prog_pack_free (kernel/bpf/core.c:1006)
[ 44.510507] bpf_jit_binary_pack_free (kernel/bpf/core.c:1195)
[ 44.516017] bpf_jit_free (include/linux/filter.h:1083 arch/arm64/net/bpf_jit_comp.c:2474)
[ 44.520424] bpf_prog_free_deferred (kernel/bpf/core.c:2785)
[ 44.525864] process_one_work (kernel/workqueue.c:3273)
[ 44.530754] worker_thread (kernel/workqueue.c:3342 (discriminator 2) kernel/workqueue.c:3429 (discriminator 2))
[ 44.535364] kthread (kernel/kthread.c:388)
[ 44.539417] ret_from_fork (arch/arm64/kernel/entry.S:861)
[ 44.543791] ---[ end trace 0000000000000000 ]---

# bad: [dbd9e2e056d8577375ae4b31ada94f8aa3769e8a] Add linux-next specific files for 20240516
git bisect start 'next/master'
# status: waiting for good commit(s), bad commit known
# good: [8c06da67d0bd3139a97f301b4aa9c482b9d4f29e] Merge tag 'livepatching-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching
git bisect good 8c06da67d0bd3139a97f301b4aa9c482b9d4f29e
# good: [147d3734724040bb0aff1252299e48947a6c8858] Merge branch 'master' of git://linuxtv.org/mchehab/media-next.git
git bisect good 147d3734724040bb0aff1252299e48947a6c8858
# bad: [729cf96da8de5e7ae70fef40a1b864bc00c2dca1] Merge branch 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm.git
git bisect bad 729cf96da8de5e7ae70fef40a1b864bc00c2dca1
# good: [4364438497c638785b1394aab764a15b6baefaf3] Merge branch 'drm-xe-next' of https://gitlab.freedesktop.org/drm/xe/kernel
git bisect good 4364438497c638785b1394aab764a15b6baefaf3
# bad: [b3ead6c10eccbfa446ce30927f94472c278cd3d7] Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git
git bisect bad b3ead6c10eccbfa446ce30927f94472c278cd3d7
# bad: [d83384f475a4cfa0e9bda1cab538d99360fa2c48] Merge branch 'for-mfd-next' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd.git
git bisect bad d83384f475a4cfa0e9bda1cab538d99360fa2c48
# bad: [9564f97e8e3ec6bdbf0c105b45fa2516d64c4685] Merge branch 'for-next' of git://git.kernel.dk/linux-block.git
git bisect bad 9564f97e8e3ec6bdbf0c105b45fa2516d64c4685
# bad: [0e6c77dedcb11f510c0dbdaf6455b918b28f1b62] Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input.git
git bisect bad 0e6c77dedcb11f510c0dbdaf6455b918b28f1b62
# good: [5852f2afcdd9b7c9dedec4fdf14b8b079349828f] Input: drop explicit initialization of struct i2c_device_id::driver_data to 0
git bisect good 5852f2afcdd9b7c9dedec4fdf14b8b079349828f
# good: [223b5e57d0d50b0c07b933350dbcde92018d3080] mm/execmem, arch: convert remaining overrides of module_alloc to execmem
git bisect good 223b5e57d0d50b0c07b933350dbcde92018d3080
# good: [14e56fb2ed1dbc3c3171d12ab435b0f691f6f215] x86/ftrace: enable dynamic ftrace without CONFIG_MODULES
git bisect good 14e56fb2ed1dbc3c3171d12ab435b0f691f6f215
# good: [7582b7be16d0ba90e3dbd9575a730cabd9eb852a] kprobes: remove dependency on CONFIG_MODULES
git bisect good 7582b7be16d0ba90e3dbd9575a730cabd9eb852a
# bad: [86d899efdd58c98a0d196e31945009fc47a56264] Merge branch 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git
git bisect bad 86d899efdd58c98a0d196e31945009fc47a56264
# bad: [2c9e5d4a008293407836d29d35dfd4353615bd2f] bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of
git bisect bad 2c9e5d4a008293407836d29d35dfd4353615bd2f
# first bad commit: [2c9e5d4a008293407836d29d35dfd4353615bd2f] bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of

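A note on reading the first oops above: the "All code" listing decodes the bytes as x86, so it says nothing useful about this arm64 trap. The sketch below decodes the two raw values from the log by hand, using the A64 `BRK` encoding (`0xd4200000 | imm16 << 5`) and the ESR layout (exception class in bits [31:26]). The helper names are made up for illustration and are not kernel functions. Both values come out as `BRK #0x100`, which would be consistent with the JIT region still holding trap filler rather than the generated program; that reading is an inference, not something stated in the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; none of these exist in the kernel. */

/* A64 BRK: fixed bits 0xd4200000, imm16 in bits [20:5], low five bits zero. */
static bool insn_is_brk(uint32_t insn)
{
	return (insn & 0xffe0001fu) == 0xd4200000u;
}

static uint16_t insn_brk_imm16(uint32_t insn)
{
	return (uint16_t)((insn >> 5) & 0xffffu);
}

/* ESR_ELx: exception class (EC) lives in bits [31:26]. */
static uint32_t esr_exception_class(uint64_t esr)
{
	return (uint32_t)((esr >> 26) & 0x3fu);
}

/* For a BRK exception the low 16 ISS bits carry the instruction's imm16. */
static uint16_t esr_brk_imm16(uint64_t esr)
{
	return (uint16_t)(esr & 0xffffu);
}

/* Checks the values copied from the oops: Code word and "BRK handler" ESR. */
static void check_oops_values(void)
{
	const uint32_t code_word = 0xd4202000u;	/* from the "Code:" line */
	const uint64_t esr = 0xf2000100u;	/* from "BRK handler:" */

	assert(insn_is_brk(code_word));
	assert(insn_brk_imm16(code_word) == 0x100);
	assert(esr_exception_class(esr) == 0x3c);	/* BRK executed in AArch64 state */
	assert(esr_brk_imm16(esr) == insn_brk_imm16(code_word));
}
```

Calling `check_oops_values()` from a small `main()` is enough to confirm that the instruction word and the syndrome value agree with each other.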
Hi Klara,

On Fri, May 17, 2024 at 01:00:31AM +0200, Klara Modin wrote:
> On 2024-05-05 18:06, Mike Rapoport wrote:
> > From: "Mike Rapoport (IBM)" <rppt@kernel.org>
> >
> > BPF just-in-time compiler depended on CONFIG_MODULES because it used
> > module_alloc() to allocate memory for the generated code.
> >
> > Since code allocations are now implemented with execmem, drop dependency of
> > CONFIG_BPF_JIT on CONFIG_MODULES and make it select CONFIG_EXECMEM.
> >
> > Suggested-by: Björn Töpel <bjorn@kernel.org>
> > Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
> > ---
> >  kernel/bpf/Kconfig | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/bpf/Kconfig b/kernel/bpf/Kconfig
> > index bc25f5098a25..f999e4e0b344 100644
> > --- a/kernel/bpf/Kconfig
> > +++ b/kernel/bpf/Kconfig
> > @@ -43,7 +43,7 @@ config BPF_JIT
> >      bool "Enable BPF Just In Time compiler"
> >      depends on BPF
> >      depends on HAVE_CBPF_JIT || HAVE_EBPF_JIT
> > -    depends on MODULES
> > +    select EXECMEM
> >      help
> >        BPF programs are normally handled by a BPF interpreter. This option
> >        allows the kernel to generate native code when a program is loaded
>
> This does not seem to work entirely. If I build with BPF_JIT without module
> support for my Raspberry Pi 3 B I get warnings in my kernel log (easiest way
> to trigger it seems to be trying to ssh into it, which fails).

Thanks for the report. I was able to reproduce this using QEMU and it
looks like the problem is because bpf_arch_text_copy() silently fails
to write to the read-only area as a result of patch_map() faulting and
the resulting -EFAULT being chucked away.

Please can you try the diff below?

Will

--->8

diff --git a/arch/arm64/kernel/patching.c b/arch/arm64/kernel/patching.c
index 255534930368..94b9fea65aca 100644
--- a/arch/arm64/kernel/patching.c
+++ b/arch/arm64/kernel/patching.c
@@ -36,7 +36,7 @@ static void __kprobes *patch_map(void *addr, int fixmap)

     if (image)
         page = phys_to_page(__pa_symbol(addr));
-    else if (IS_ENABLED(CONFIG_STRICT_MODULE_RWX))
+    else if (IS_ENABLED(CONFIG_EXECMEM))
         page = vmalloc_to_page(addr);
     else
         return addr;

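To make the effect of that one-line change easier to follow, here is a minimal user-space model of the branch selection in the hunk above. It is a sketch, not kernel code: the enum, the struct and both function names are invented for illustration, and it assumes that CONFIG_STRICT_MODULE_RWX cannot be enabled when CONFIG_MODULES is off, which matches the reported configuration but is not spelled out in the thread.

```c
#include <stdbool.h>

/* Hypothetical model of the three outcomes in the patch_map() hunk. */
enum patch_target {
	MAP_KERNEL_IMAGE_PAGE,	/* page = phys_to_page(__pa_symbol(addr)) */
	MAP_VMALLOC_PAGE,	/* page = vmalloc_to_page(addr) */
	USE_ADDR_DIRECTLY,	/* return addr; write through the original mapping */
};

/* Stand-ins for the IS_ENABLED() checks; field names are illustrative. */
struct model_config {
	bool strict_module_rwx;	/* CONFIG_STRICT_MODULE_RWX */
	bool execmem;		/* CONFIG_EXECMEM */
};

/* Branch selection before the proposed change. */
static enum patch_target patch_map_before(bool image, struct model_config cfg)
{
	if (image)
		return MAP_KERNEL_IMAGE_PAGE;
	if (cfg.strict_module_rwx)
		return MAP_VMALLOC_PAGE;
	return USE_ADDR_DIRECTLY;
}

/* Branch selection with the proposed change applied. */
static enum patch_target patch_map_after(bool image, struct model_config cfg)
{
	if (image)
		return MAP_KERNEL_IMAGE_PAGE;
	if (cfg.execmem)
		return MAP_VMALLOC_PAGE;
	return USE_ADDR_DIRECTLY;
}
```

With `image == false` and a config of `{ .strict_module_rwx = false, .execmem = true }` (BPF_JIT selected, modules disabled), `patch_map_before()` yields `USE_ADDR_DIRECTLY`, meaning the write goes straight at the read-only JIT mapping, while `patch_map_after()` yields `MAP_VMALLOC_PAGE`, so the page is remapped writable first.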
On 2024-05-17 17:46, Will Deacon wrote:
> Hi Klara,
>
> On Fri, May 17, 2024 at 01:00:31AM +0200, Klara Modin wrote:
>>
>> This does not seem to work entirely. If I build with BPF_JIT without module
>> support for my Raspberry Pi 3 B I get warnings in my kernel log (easiest way
>> to trigger it seems to be trying to ssh into it, which fails).
>
> Thanks for the report. I was able to reproduce this using QEMU and it
> looks like the problem is because bpf_arch_text_copy() silently fails
> to write to the read-only area as a result of patch_map() faulting and
> the resulting -EFAULT being chucked away.
>
> Please can you try the diff below?
>
> Will
>
> --->8
>
> diff --git a/arch/arm64/kernel/patching.c b/arch/arm64/kernel/patching.c
> index 255534930368..94b9fea65aca 100644
> --- a/arch/arm64/kernel/patching.c
> +++ b/arch/arm64/kernel/patching.c
> @@ -36,7 +36,7 @@ static void __kprobes *patch_map(void *addr, int fixmap)
>
>      if (image)
>          page = phys_to_page(__pa_symbol(addr));
> -    else if (IS_ENABLED(CONFIG_STRICT_MODULE_RWX))
> +    else if (IS_ENABLED(CONFIG_EXECMEM))
>          page = vmalloc_to_page(addr);
>      else
>          return addr;
>

This seems to work from my short testing.

Thanks,
Tested-by: Klara Modin <klarasmodin@gmail.com>

diff --git a/kernel/bpf/Kconfig b/kernel/bpf/Kconfig
index bc25f5098a25..f999e4e0b344 100644
--- a/kernel/bpf/Kconfig
+++ b/kernel/bpf/Kconfig
@@ -43,7 +43,7 @@ config BPF_JIT
     bool "Enable BPF Just In Time compiler"
     depends on BPF
     depends on HAVE_CBPF_JIT || HAVE_EBPF_JIT
-    depends on MODULES
+    select EXECMEM
     help
       BPF programs are normally handled by a BPF interpreter. This option
       allows the kernel to generate native code when a program is loaded