Message ID | 20201026172907.1468294-1-jean-philippe@linaro.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arm64: Fix early single-stepping | expand |
On Mon, Oct 26, 2020 at 06:29:09PM +0100, Jean-Philippe Brucker wrote: > To use debug features such as single-step, the OS lock must be unlocked > in the debug registers. Currently this is done in postcore_initcall > which is now too late. > > Commit 36dadef23fcc ("kprobes: Init kprobes in early_initcall") enabled > using kprobes from early_initcall, when OS lock is still locked. So when > kprobe attempts to single-step a patched instruction, instead of > trapping, execution continues until it throws an undef exception: > > [ 0.064233] Kprobe smoke test: started > [ 0.151133] ------------[ cut here ]------------ > [ 0.151458] kernel BUG at arch/arm64/kernel/traps.c:406! > [ 0.151812] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP > ... > [ 0.162689] Call trace: > [ 0.163014] do_undefinstr+0x1d4/0x1f4 > [ 0.163336] el1_sync_handler+0xbc/0x140 > [ 0.163839] el1_sync+0x80/0x100 > [ 0.164154] 0xffffffc01001d004 > [ 0.164527] init_kprobes+0x13c/0x154 > [ 0.164968] do_one_initcall+0x54/0x2e0 > [ 0.165322] kernel_init_freeable+0xf4/0x258 > [ 0.165783] kernel_init+0x20/0x12c > [ 0.166117] ret_from_fork+0x10/0x30 > [ 0.166595] Code: 97ffff53 a9425bf5 17ffff9b f9001bf7 (d4210000) > [ 0.167084] ---[ end trace 36778fdf576e9a79 ]--- > > To fix this, unlock the OS lock as early as possible. Do it in > traps_init() for CPU0, since KGDB wants to use single-step from that > point on according to commit b322c65f8ca3 ("arm64: Call > debug_traps_init() from trap_init() to help early kgdb"). > For secondary CPUs, setup the CPU hotplug handler at early_initcall. Hmm, does this mean we end up setting MDSCR_EL1.KDE before we've reset the hardware breakpoint/watchpoint registers? Why do we need kprobes so early? Will
On Mon, 26 Oct 2020 17:38:37 +0000 Will Deacon <will@kernel.org> wrote: > On Mon, Oct 26, 2020 at 06:29:09PM +0100, Jean-Philippe Brucker wrote: > > To use debug features such as single-step, the OS lock must be unlocked > > in the debug registers. Currently this is done in postcore_initcall > > which is now too late. > > > > Commit 36dadef23fcc ("kprobes: Init kprobes in early_initcall") enabled > > using kprobes from early_initcall, when OS lock is still locked. So when > > kprobe attempts to single-step a patched instruction, instead of > > trapping, execution continues until it throws an undef exception: > > > > [ 0.064233] Kprobe smoke test: started > > [ 0.151133] ------------[ cut here ]------------ > > [ 0.151458] kernel BUG at arch/arm64/kernel/traps.c:406! > > [ 0.151812] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP > > ... > > [ 0.162689] Call trace: > > [ 0.163014] do_undefinstr+0x1d4/0x1f4 > > [ 0.163336] el1_sync_handler+0xbc/0x140 > > [ 0.163839] el1_sync+0x80/0x100 > > [ 0.164154] 0xffffffc01001d004 > > [ 0.164527] init_kprobes+0x13c/0x154 > > [ 0.164968] do_one_initcall+0x54/0x2e0 > > [ 0.165322] kernel_init_freeable+0xf4/0x258 > > [ 0.165783] kernel_init+0x20/0x12c > > [ 0.166117] ret_from_fork+0x10/0x30 > > [ 0.166595] Code: 97ffff53 a9425bf5 17ffff9b f9001bf7 (d4210000) > > [ 0.167084] ---[ end trace 36778fdf576e9a79 ]--- > > > > To fix this, unlock the OS lock as early as possible. Do it in > > traps_init() for CPU0, since KGDB wants to use single-step from that > > point on according to commit b322c65f8ca3 ("arm64: Call > > debug_traps_init() from trap_init() to help early kgdb"). > > For secondary CPUs, setup the CPU hotplug handler at early_initcall. Oops, thanks for the fix! I missed it. > Hmm, does this mean we end up setting MDSCR_EL1.KDE before we've reset the > hardware breakpoint/watchpoint registers? Why do we need kprobes so early? This is for boot-time tracing. To enable kprobes events in core_initcall(), we need to enable kprobes itself in early_initcall().(or early_initcall_sync()) With this, we can trace postcore functions with kprobes(it includes some platform initializations), which is making boot-time ftrace more useful. For example, we can trace function-calls in specific code area as I posted an example; https://lore.kernel.org/linux-doc/159887792384.1330989.5993224243767476896.stgit@devnote2/ So this expands the feature to the earlier stages. Thank you,
On Mon, 26 Oct 2020 18:29:09 +0100 Jean-Philippe Brucker <jean-philippe@linaro.org> wrote: > To use debug features such as single-step, the OS lock must be unlocked > in the debug registers. Currently this is done in postcore_initcall > which is now too late. > > Commit 36dadef23fcc ("kprobes: Init kprobes in early_initcall") enabled > using kprobes from early_initcall, when OS lock is still locked. So when > kprobe attempts to single-step a patched instruction, instead of > trapping, execution continues until it throws an undef exception: > > [ 0.064233] Kprobe smoke test: started > [ 0.151133] ------------[ cut here ]------------ > [ 0.151458] kernel BUG at arch/arm64/kernel/traps.c:406! > [ 0.151812] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP > ... > [ 0.162689] Call trace: > [ 0.163014] do_undefinstr+0x1d4/0x1f4 > [ 0.163336] el1_sync_handler+0xbc/0x140 > [ 0.163839] el1_sync+0x80/0x100 > [ 0.164154] 0xffffffc01001d004 > [ 0.164527] init_kprobes+0x13c/0x154 > [ 0.164968] do_one_initcall+0x54/0x2e0 > [ 0.165322] kernel_init_freeable+0xf4/0x258 > [ 0.165783] kernel_init+0x20/0x12c > [ 0.166117] ret_from_fork+0x10/0x30 > [ 0.166595] Code: 97ffff53 a9425bf5 17ffff9b f9001bf7 (d4210000) > [ 0.167084] ---[ end trace 36778fdf576e9a79 ]--- > > To fix this, unlock the OS lock as early as possible. Do it in > traps_init() for CPU0, since KGDB wants to use single-step from that > point on according to commit b322c65f8ca3 ("arm64: Call > debug_traps_init() from trap_init() to help early kgdb"). > For secondary CPUs, setup the CPU hotplug handler at early_initcall. > > Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall") > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Hi Jean, How have you confirmed this fixes the issue? On my environment, this doesn't fix the issue. We need to call debug_monitors_init() before calling init_kprobe(). But since both are early_initcall, it doesn't guarantee that the debug_monitors_init() is called before init_kprobes(). I think there are 2 ways to fix this (in addition to this patch), - split kprobe selftest from init_kprobe() and call the selftest in core_initcall() - call clear_os_lock(0) from arch_init_kprobes() Thank you, > --- > arch/arm64/kernel/debug-monitors.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c > index 75a423c3336a..80f082021234 100644 > --- a/arch/arm64/kernel/debug-monitors.c > +++ b/arch/arm64/kernel/debug-monitors.c > @@ -135,7 +135,7 @@ static int __init debug_monitors_init(void) > "arm64/debug_monitors:starting", > clear_os_lock, NULL); > } > -postcore_initcall(debug_monitors_init); > +early_initcall(debug_monitors_init); > > /* > * Single step API and exception handling. > @@ -380,6 +380,7 @@ NOKPROBE_SYMBOL(aarch32_break_handler); > > void __init debug_traps_init(void) > { > + clear_os_lock(0); > hook_debug_fault_code(DBG_ESR_EVT_HWSS, single_step_handler, SIGTRAP, > TRAP_TRACE, "single-step handler"); > hook_debug_fault_code(DBG_ESR_EVT_BRK, brk_handler, SIGTRAP, > -- > 2.29.1 >
On Tue, 27 Oct 2020 19:13:18 +0900 Masami Hiramatsu <mhiramat@kernel.org> wrote: > On Mon, 26 Oct 2020 18:29:09 +0100 > Jean-Philippe Brucker <jean-philippe@linaro.org> wrote: > > > To use debug features such as single-step, the OS lock must be unlocked > > in the debug registers. Currently this is done in postcore_initcall > > which is now too late. > > > > Commit 36dadef23fcc ("kprobes: Init kprobes in early_initcall") enabled > > using kprobes from early_initcall, when OS lock is still locked. So when > > kprobe attempts to single-step a patched instruction, instead of > > trapping, execution continues until it throws an undef exception: > > > > [ 0.064233] Kprobe smoke test: started > > [ 0.151133] ------------[ cut here ]------------ > > [ 0.151458] kernel BUG at arch/arm64/kernel/traps.c:406! > > [ 0.151812] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP > > ... > > [ 0.162689] Call trace: > > [ 0.163014] do_undefinstr+0x1d4/0x1f4 > > [ 0.163336] el1_sync_handler+0xbc/0x140 > > [ 0.163839] el1_sync+0x80/0x100 > > [ 0.164154] 0xffffffc01001d004 > > [ 0.164527] init_kprobes+0x13c/0x154 > > [ 0.164968] do_one_initcall+0x54/0x2e0 > > [ 0.165322] kernel_init_freeable+0xf4/0x258 > > [ 0.165783] kernel_init+0x20/0x12c > > [ 0.166117] ret_from_fork+0x10/0x30 > > [ 0.166595] Code: 97ffff53 a9425bf5 17ffff9b f9001bf7 (d4210000) > > [ 0.167084] ---[ end trace 36778fdf576e9a79 ]--- > > > > To fix this, unlock the OS lock as early as possible. Do it in > > traps_init() for CPU0, since KGDB wants to use single-step from that > > point on according to commit b322c65f8ca3 ("arm64: Call > > debug_traps_init() from trap_init() to help early kgdb"). > > For secondary CPUs, setup the CPU hotplug handler at early_initcall. > > > > Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall") > > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> > > Hi Jean, > > How have you confirmed this fixes the issue? > On my environment, this doesn't fix the issue. Oops, it was my mistake. I missed to boot up with Xen. (so I find another bug...) Anyway this works for me too. Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Masami Hiramatsu <mhiramat@kernel.org> Thank you!
On Tue, Oct 27, 2020 at 07:42:58PM +0900, Masami Hiramatsu wrote: > On Tue, 27 Oct 2020 19:13:18 +0900 > Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > On Mon, 26 Oct 2020 18:29:09 +0100 > > Jean-Philippe Brucker <jean-philippe@linaro.org> wrote: > > > > > To use debug features such as single-step, the OS lock must be unlocked > > > in the debug registers. Currently this is done in postcore_initcall > > > which is now too late. > > > > > > Commit 36dadef23fcc ("kprobes: Init kprobes in early_initcall") enabled > > > using kprobes from early_initcall, when OS lock is still locked. So when > > > kprobe attempts to single-step a patched instruction, instead of > > > trapping, execution continues until it throws an undef exception: > > > > > > [ 0.064233] Kprobe smoke test: started > > > [ 0.151133] ------------[ cut here ]------------ > > > [ 0.151458] kernel BUG at arch/arm64/kernel/traps.c:406! > > > [ 0.151812] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP > > > ... > > > [ 0.162689] Call trace: > > > [ 0.163014] do_undefinstr+0x1d4/0x1f4 > > > [ 0.163336] el1_sync_handler+0xbc/0x140 > > > [ 0.163839] el1_sync+0x80/0x100 > > > [ 0.164154] 0xffffffc01001d004 > > > [ 0.164527] init_kprobes+0x13c/0x154 > > > [ 0.164968] do_one_initcall+0x54/0x2e0 > > > [ 0.165322] kernel_init_freeable+0xf4/0x258 > > > [ 0.165783] kernel_init+0x20/0x12c > > > [ 0.166117] ret_from_fork+0x10/0x30 > > > [ 0.166595] Code: 97ffff53 a9425bf5 17ffff9b f9001bf7 (d4210000) > > > [ 0.167084] ---[ end trace 36778fdf576e9a79 ]--- > > > > > > To fix this, unlock the OS lock as early as possible. Do it in > > > traps_init() for CPU0, since KGDB wants to use single-step from that > > > point on according to commit b322c65f8ca3 ("arm64: Call > > > debug_traps_init() from trap_init() to help early kgdb"). > > > For secondary CPUs, setup the CPU hotplug handler at early_initcall. > > > > > > Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall") > > > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> > > > > Hi Jean, > > > > How have you confirmed this fixes the issue? > > On my environment, this doesn't fix the issue. > > Oops, it was my mistake. I missed to boot up with Xen. (so I find another bug...) > Anyway this works for me too. No worries :) Although now I've been wondering whether it would be better to just disable the OS lock lazily, on the first call to enable_debug_monitors(). It might add a tiny performance penalty but would avoid this problem reappearing if one of the debugger needs to start even earlier in the future. > Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> > Tested-by: Masami Hiramatsu <mhiramat@kernel.org> Thanks! Jean
On Tue, Oct 27, 2020 at 12:59:09PM +0100, Jean-Philippe Brucker wrote: > On Tue, Oct 27, 2020 at 07:42:58PM +0900, Masami Hiramatsu wrote: > > On Tue, 27 Oct 2020 19:13:18 +0900 > > Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > > > On Mon, 26 Oct 2020 18:29:09 +0100 > > > Jean-Philippe Brucker <jean-philippe@linaro.org> wrote: > > > > > > > To use debug features such as single-step, the OS lock must be unlocked > > > > in the debug registers. Currently this is done in postcore_initcall > > > > which is now too late. > > > > > > > > Commit 36dadef23fcc ("kprobes: Init kprobes in early_initcall") enabled > > > > using kprobes from early_initcall, when OS lock is still locked. So when > > > > kprobe attempts to single-step a patched instruction, instead of > > > > trapping, execution continues until it throws an undef exception: > > > > > > > > [ 0.064233] Kprobe smoke test: started > > > > [ 0.151133] ------------[ cut here ]------------ > > > > [ 0.151458] kernel BUG at arch/arm64/kernel/traps.c:406! > > > > [ 0.151812] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP > > > > ... > > > > [ 0.162689] Call trace: > > > > [ 0.163014] do_undefinstr+0x1d4/0x1f4 > > > > [ 0.163336] el1_sync_handler+0xbc/0x140 > > > > [ 0.163839] el1_sync+0x80/0x100 > > > > [ 0.164154] 0xffffffc01001d004 > > > > [ 0.164527] init_kprobes+0x13c/0x154 > > > > [ 0.164968] do_one_initcall+0x54/0x2e0 > > > > [ 0.165322] kernel_init_freeable+0xf4/0x258 > > > > [ 0.165783] kernel_init+0x20/0x12c > > > > [ 0.166117] ret_from_fork+0x10/0x30 > > > > [ 0.166595] Code: 97ffff53 a9425bf5 17ffff9b f9001bf7 (d4210000) > > > > [ 0.167084] ---[ end trace 36778fdf576e9a79 ]--- > > > > > > > > To fix this, unlock the OS lock as early as possible. Do it in > > > > traps_init() for CPU0, since KGDB wants to use single-step from that > > > > point on according to commit b322c65f8ca3 ("arm64: Call > > > > debug_traps_init() from trap_init() to help early kgdb"). > > > > For secondary CPUs, setup the CPU hotplug handler at early_initcall. > > > > > > > > Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall") > > > > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> > > > > > > Hi Jean, > > > > > > How have you confirmed this fixes the issue? > > > On my environment, this doesn't fix the issue. > > > > Oops, it was my mistake. I missed to boot up with Xen. (so I find another bug...) > > Anyway this works for me too. > > No worries :) Although now I've been wondering whether it would be better > to just disable the OS lock lazily, on the first call to > enable_debug_monitors(). It might add a tiny performance penalty but would > avoid this problem reappearing if one of the debugger needs to start even > earlier in the future. I'm still uneasy about enabling KDE with the watchpoint registers in an unknown state, so I think this needs more work. Will
On Tue, 27 Oct 2020 12:33:18 +0000 Will Deacon <will@kernel.org> wrote: > On Tue, Oct 27, 2020 at 12:59:09PM +0100, Jean-Philippe Brucker wrote: > > On Tue, Oct 27, 2020 at 07:42:58PM +0900, Masami Hiramatsu wrote: > > > On Tue, 27 Oct 2020 19:13:18 +0900 > > > Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > > > > > On Mon, 26 Oct 2020 18:29:09 +0100 > > > > Jean-Philippe Brucker <jean-philippe@linaro.org> wrote: > > > > > > > > > To use debug features such as single-step, the OS lock must be unlocked > > > > > in the debug registers. Currently this is done in postcore_initcall > > > > > which is now too late. > > > > > > > > > > Commit 36dadef23fcc ("kprobes: Init kprobes in early_initcall") enabled > > > > > using kprobes from early_initcall, when OS lock is still locked. So when > > > > > kprobe attempts to single-step a patched instruction, instead of > > > > > trapping, execution continues until it throws an undef exception: > > > > > > > > > > [ 0.064233] Kprobe smoke test: started > > > > > [ 0.151133] ------------[ cut here ]------------ > > > > > [ 0.151458] kernel BUG at arch/arm64/kernel/traps.c:406! > > > > > [ 0.151812] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP > > > > > ... > > > > > [ 0.162689] Call trace: > > > > > [ 0.163014] do_undefinstr+0x1d4/0x1f4 > > > > > [ 0.163336] el1_sync_handler+0xbc/0x140 > > > > > [ 0.163839] el1_sync+0x80/0x100 > > > > > [ 0.164154] 0xffffffc01001d004 > > > > > [ 0.164527] init_kprobes+0x13c/0x154 > > > > > [ 0.164968] do_one_initcall+0x54/0x2e0 > > > > > [ 0.165322] kernel_init_freeable+0xf4/0x258 > > > > > [ 0.165783] kernel_init+0x20/0x12c > > > > > [ 0.166117] ret_from_fork+0x10/0x30 > > > > > [ 0.166595] Code: 97ffff53 a9425bf5 17ffff9b f9001bf7 (d4210000) > > > > > [ 0.167084] ---[ end trace 36778fdf576e9a79 ]--- > > > > > > > > > > To fix this, unlock the OS lock as early as possible. Do it in > > > > > traps_init() for CPU0, since KGDB wants to use single-step from that > > > > > point on according to commit b322c65f8ca3 ("arm64: Call > > > > > debug_traps_init() from trap_init() to help early kgdb"). > > > > > For secondary CPUs, setup the CPU hotplug handler at early_initcall. > > > > > > > > > > Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall") > > > > > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> > > > > > > > > Hi Jean, > > > > > > > > How have you confirmed this fixes the issue? > > > > On my environment, this doesn't fix the issue. > > > > > > Oops, it was my mistake. I missed to boot up with Xen. (so I find another bug...) > > > Anyway this works for me too. > > > > No worries :) Although now I've been wondering whether it would be better > > to just disable the OS lock lazily, on the first call to > > enable_debug_monitors(). It might add a tiny performance penalty but would > > avoid this problem reappearing if one of the debugger needs to start even > > earlier in the future. > > I'm still uneasy about enabling KDE with the watchpoint registers in an > unknown state, so I think this needs more work. Hmm, how we reset it in the early stage? reset watchpoint registers first? Thanks, > > Will
On Tue, Oct 27, 2020 at 10:49:22PM +0900, Masami Hiramatsu wrote: > On Tue, 27 Oct 2020 12:33:18 +0000 > Will Deacon <will@kernel.org> wrote: > > > On Tue, Oct 27, 2020 at 12:59:09PM +0100, Jean-Philippe Brucker wrote: > > > On Tue, Oct 27, 2020 at 07:42:58PM +0900, Masami Hiramatsu wrote: > > > > On Tue, 27 Oct 2020 19:13:18 +0900 > > > > Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > > > > > > > On Mon, 26 Oct 2020 18:29:09 +0100 > > > > > Jean-Philippe Brucker <jean-philippe@linaro.org> wrote: > > > > > > > > > > > To use debug features such as single-step, the OS lock must be unlocked > > > > > > in the debug registers. Currently this is done in postcore_initcall > > > > > > which is now too late. > > > > > > > > > > > > Commit 36dadef23fcc ("kprobes: Init kprobes in early_initcall") enabled > > > > > > using kprobes from early_initcall, when OS lock is still locked. So when > > > > > > kprobe attempts to single-step a patched instruction, instead of > > > > > > trapping, execution continues until it throws an undef exception: > > > > > > > > > > > > [ 0.064233] Kprobe smoke test: started > > > > > > [ 0.151133] ------------[ cut here ]------------ > > > > > > [ 0.151458] kernel BUG at arch/arm64/kernel/traps.c:406! > > > > > > [ 0.151812] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP > > > > > > ... > > > > > > [ 0.162689] Call trace: > > > > > > [ 0.163014] do_undefinstr+0x1d4/0x1f4 > > > > > > [ 0.163336] el1_sync_handler+0xbc/0x140 > > > > > > [ 0.163839] el1_sync+0x80/0x100 > > > > > > [ 0.164154] 0xffffffc01001d004 > > > > > > [ 0.164527] init_kprobes+0x13c/0x154 > > > > > > [ 0.164968] do_one_initcall+0x54/0x2e0 > > > > > > [ 0.165322] kernel_init_freeable+0xf4/0x258 > > > > > > [ 0.165783] kernel_init+0x20/0x12c > > > > > > [ 0.166117] ret_from_fork+0x10/0x30 > > > > > > [ 0.166595] Code: 97ffff53 a9425bf5 17ffff9b f9001bf7 (d4210000) > > > > > > [ 0.167084] ---[ end trace 36778fdf576e9a79 ]--- > > > > > > > > > > > > To fix this, unlock the OS lock as early as possible. Do it in > > > > > > traps_init() for CPU0, since KGDB wants to use single-step from that > > > > > > point on according to commit b322c65f8ca3 ("arm64: Call > > > > > > debug_traps_init() from trap_init() to help early kgdb"). > > > > > > For secondary CPUs, setup the CPU hotplug handler at early_initcall. > > > > > > > > > > > > Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall") > > > > > > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> > > > > > > > > > > Hi Jean, > > > > > > > > > > How have you confirmed this fixes the issue? > > > > > On my environment, this doesn't fix the issue. > > > > > > > > Oops, it was my mistake. I missed to boot up with Xen. (so I find another bug...) > > > > Anyway this works for me too. > > > > > > No worries :) Although now I've been wondering whether it would be better > > > to just disable the OS lock lazily, on the first call to > > > enable_debug_monitors(). It might add a tiny performance penalty but would > > > avoid this problem reappearing if one of the debugger needs to start even > > > earlier in the future. > > > > I'm still uneasy about enabling KDE with the watchpoint registers in an > > unknown state, so I think this needs more work. > > Hmm, how we reset it in the early stage? reset watchpoint registers first? Yes, I think so. Same order problem as the OS lock, they need to be reset before enable_debug_monitors(). On CPU0 that would be before early_initcall and for secondaries the hotplug notifier needs to be installed earlier as well. I'll send a v2. Thanks, Jean
On Wed, Oct 28, 2020 at 09:28:20AM +0100, Jean-Philippe Brucker wrote: > On Tue, Oct 27, 2020 at 10:49:22PM +0900, Masami Hiramatsu wrote: > > On Tue, 27 Oct 2020 12:33:18 +0000 > > Will Deacon <will@kernel.org> wrote: > > > > > On Tue, Oct 27, 2020 at 12:59:09PM +0100, Jean-Philippe Brucker wrote: > > > > On Tue, Oct 27, 2020 at 07:42:58PM +0900, Masami Hiramatsu wrote: > > > > > On Tue, 27 Oct 2020 19:13:18 +0900 > > > > > Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > > > > > > > > > On Mon, 26 Oct 2020 18:29:09 +0100 > > > > > > Jean-Philippe Brucker <jean-philippe@linaro.org> wrote: > > > > > > > > > > > > > To use debug features such as single-step, the OS lock must be unlocked > > > > > > > in the debug registers. Currently this is done in postcore_initcall > > > > > > > which is now too late. > > > > > > > > > > > > > > Commit 36dadef23fcc ("kprobes: Init kprobes in early_initcall") enabled > > > > > > > using kprobes from early_initcall, when OS lock is still locked. So when > > > > > > > kprobe attempts to single-step a patched instruction, instead of > > > > > > > trapping, execution continues until it throws an undef exception: > > > > > > > > > > > > > > [ 0.064233] Kprobe smoke test: started > > > > > > > [ 0.151133] ------------[ cut here ]------------ > > > > > > > [ 0.151458] kernel BUG at arch/arm64/kernel/traps.c:406! > > > > > > > [ 0.151812] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP > > > > > > > ... > > > > > > > [ 0.162689] Call trace: > > > > > > > [ 0.163014] do_undefinstr+0x1d4/0x1f4 > > > > > > > [ 0.163336] el1_sync_handler+0xbc/0x140 > > > > > > > [ 0.163839] el1_sync+0x80/0x100 > > > > > > > [ 0.164154] 0xffffffc01001d004 > > > > > > > [ 0.164527] init_kprobes+0x13c/0x154 > > > > > > > [ 0.164968] do_one_initcall+0x54/0x2e0 > > > > > > > [ 0.165322] kernel_init_freeable+0xf4/0x258 > > > > > > > [ 0.165783] kernel_init+0x20/0x12c > > > > > > > [ 0.166117] ret_from_fork+0x10/0x30 > > > > > > > [ 0.166595] Code: 97ffff53 a9425bf5 17ffff9b f9001bf7 (d4210000) > > > > > > > [ 0.167084] ---[ end trace 36778fdf576e9a79 ]--- > > > > > > > > > > > > > > To fix this, unlock the OS lock as early as possible. Do it in > > > > > > > traps_init() for CPU0, since KGDB wants to use single-step from that > > > > > > > point on according to commit b322c65f8ca3 ("arm64: Call > > > > > > > debug_traps_init() from trap_init() to help early kgdb"). > > > > > > > For secondary CPUs, setup the CPU hotplug handler at early_initcall. > > > > > > > > > > > > > > Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall") > > > > > > > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> > > > > > > > > > > > > Hi Jean, > > > > > > > > > > > > How have you confirmed this fixes the issue? > > > > > > On my environment, this doesn't fix the issue. > > > > > > > > > > Oops, it was my mistake. I missed to boot up with Xen. (so I find another bug...) > > > > > Anyway this works for me too. > > > > > > > > No worries :) Although now I've been wondering whether it would be better > > > > to just disable the OS lock lazily, on the first call to > > > > enable_debug_monitors(). It might add a tiny performance penalty but would > > > > avoid this problem reappearing if one of the debugger needs to start even > > > > earlier in the future. > > > > > > I'm still uneasy about enabling KDE with the watchpoint registers in an > > > unknown state, so I think this needs more work. > > > > Hmm, how we reset it in the early stage? reset watchpoint registers first? > > Yes, I think so. Same order problem as the OS lock, they need to be reset > before enable_debug_monitors(). On CPU0 that would be before > early_initcall and for secondaries the hotplug notifier needs to be > installed earlier as well. I'll send a v2. Cheers. An alternative (which I think would be better in the long run anyway) would be to avoid using hardware step in kprobes and instead rely on a BRK instruction to trap after running the trampoline. Then we just need to ensure that the BRK handlers are up and running in time (which might require an early handler, like we have for KASAN). Will
On Wed, 28 Oct 2020 08:36:44 +0000 Will Deacon <will@kernel.org> wrote: > On Wed, Oct 28, 2020 at 09:28:20AM +0100, Jean-Philippe Brucker wrote: > > On Tue, Oct 27, 2020 at 10:49:22PM +0900, Masami Hiramatsu wrote: > > > On Tue, 27 Oct 2020 12:33:18 +0000 > > > Will Deacon <will@kernel.org> wrote: > > > > > > > On Tue, Oct 27, 2020 at 12:59:09PM +0100, Jean-Philippe Brucker wrote: > > > > > On Tue, Oct 27, 2020 at 07:42:58PM +0900, Masami Hiramatsu wrote: > > > > > > On Tue, 27 Oct 2020 19:13:18 +0900 > > > > > > Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > > > > > > > > > > > On Mon, 26 Oct 2020 18:29:09 +0100 > > > > > > > Jean-Philippe Brucker <jean-philippe@linaro.org> wrote: > > > > > > > > > > > > > > > To use debug features such as single-step, the OS lock must be unlocked > > > > > > > > in the debug registers. Currently this is done in postcore_initcall > > > > > > > > which is now too late. > > > > > > > > > > > > > > > > Commit 36dadef23fcc ("kprobes: Init kprobes in early_initcall") enabled > > > > > > > > using kprobes from early_initcall, when OS lock is still locked. So when > > > > > > > > kprobe attempts to single-step a patched instruction, instead of > > > > > > > > trapping, execution continues until it throws an undef exception: > > > > > > > > > > > > > > > > [ 0.064233] Kprobe smoke test: started > > > > > > > > [ 0.151133] ------------[ cut here ]------------ > > > > > > > > [ 0.151458] kernel BUG at arch/arm64/kernel/traps.c:406! > > > > > > > > [ 0.151812] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP > > > > > > > > ... > > > > > > > > [ 0.162689] Call trace: > > > > > > > > [ 0.163014] do_undefinstr+0x1d4/0x1f4 > > > > > > > > [ 0.163336] el1_sync_handler+0xbc/0x140 > > > > > > > > [ 0.163839] el1_sync+0x80/0x100 > > > > > > > > [ 0.164154] 0xffffffc01001d004 > > > > > > > > [ 0.164527] init_kprobes+0x13c/0x154 > > > > > > > > [ 0.164968] do_one_initcall+0x54/0x2e0 > > > > > > > > [ 0.165322] kernel_init_freeable+0xf4/0x258 > > > > > > > > [ 0.165783] kernel_init+0x20/0x12c > > > > > > > > [ 0.166117] ret_from_fork+0x10/0x30 > > > > > > > > [ 0.166595] Code: 97ffff53 a9425bf5 17ffff9b f9001bf7 (d4210000) > > > > > > > > [ 0.167084] ---[ end trace 36778fdf576e9a79 ]--- > > > > > > > > > > > > > > > > To fix this, unlock the OS lock as early as possible. Do it in > > > > > > > > traps_init() for CPU0, since KGDB wants to use single-step from that > > > > > > > > point on according to commit b322c65f8ca3 ("arm64: Call > > > > > > > > debug_traps_init() from trap_init() to help early kgdb"). > > > > > > > > For secondary CPUs, setup the CPU hotplug handler at early_initcall. > > > > > > > > > > > > > > > > Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall") > > > > > > > > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> > > > > > > > > > > > > > > Hi Jean, > > > > > > > > > > > > > > How have you confirmed this fixes the issue? > > > > > > > On my environment, this doesn't fix the issue. > > > > > > > > > > > > Oops, it was my mistake. I missed to boot up with Xen. (so I find another bug...) > > > > > > Anyway this works for me too. > > > > > > > > > > No worries :) Although now I've been wondering whether it would be better > > > > > to just disable the OS lock lazily, on the first call to > > > > > enable_debug_monitors(). It might add a tiny performance penalty but would > > > > > avoid this problem reappearing if one of the debugger needs to start even > > > > > earlier in the future. > > > > > > > > I'm still uneasy about enabling KDE with the watchpoint registers in an > > > > unknown state, so I think this needs more work. > > > > > > Hmm, how we reset it in the early stage? reset watchpoint registers first? > > > > Yes, I think so. Same order problem as the OS lock, they need to be reset > > before enable_debug_monitors(). On CPU0 that would be before > > early_initcall and for secondaries the hotplug notifier needs to be > > installed earlier as well. I'll send a v2. > > Cheers. An alternative (which I think would be better in the long run > anyway) would be to avoid using hardware step in kprobes and instead rely > on a BRK instruction to trap after running the trampoline. But how we trap the instruction which can change pc? (like br?) Are all those instruction emulated now? Thank you, > Then we just need to ensure that the BRK handlers are up and running in > time (which might require an early handler, like we have for KASAN). > > Will
On Wed, Oct 28, 2020 at 06:07:31PM +0900, Masami Hiramatsu wrote: > > > Yes, I think so. Same order problem as the OS lock, they need to be reset > > > before enable_debug_monitors(). On CPU0 that would be before > > > early_initcall and for secondaries the hotplug notifier needs to be > > > installed earlier as well. I'll send a v2. > > > > Cheers. An alternative (which I think would be better in the long run > > anyway) would be to avoid using hardware step in kprobes and instead rely > > on a BRK instruction to trap after running the trampoline. > > But how we trap the instruction which can change pc? (like br?) > Are all those instruction emulated now? According to aarch64_insn_is_steppable() anything that changes the PC is emulated. I'm also checking whether there is a change of behavior with synchronous exceptions taken while single-stepping (page faults). Thanks, Jean
On Wed, 28 Oct 2020 10:48:27 +0100 Jean-Philippe Brucker <jean-philippe@linaro.org> wrote: > On Wed, Oct 28, 2020 at 06:07:31PM +0900, Masami Hiramatsu wrote: > > > > Yes, I think so. Same order problem as the OS lock, they need to be reset > > > > before enable_debug_monitors(). On CPU0 that would be before > > > > early_initcall and for secondaries the hotplug notifier needs to be > > > > installed earlier as well. I'll send a v2. > > > > > > Cheers. An alternative (which I think would be better in the long run > > > anyway) would be to avoid using hardware step in kprobes and instead rely > > > on a BRK instruction to trap after running the trampoline. > > > > But how we trap the instruction which can change pc? (like br?) > > Are all those instruction emulated now? > > According to aarch64_insn_is_steppable() anything that changes the PC is > emulated. OK, that sounds good. Then we can put the BRK right after the copied instruction. > I'm also checking whether there is a change of behavior with > synchronous exceptions taken while single-stepping (page faults). Thanks! From the kprobe_fault_handler(), it seems the page faults handled before the single-stepping exception and the fault handler disables single steping explicitly. So if we use BRK, that code will not be needed. Thank you,
Hi Will, On Wed, 28 Oct 2020 08:36:44 +0000 Will Deacon <will@kernel.org> wrote: > On Wed, Oct 28, 2020 at 09:28:20AM +0100, Jean-Philippe Brucker wrote: > > On Tue, Oct 27, 2020 at 10:49:22PM +0900, Masami Hiramatsu wrote: > > > On Tue, 27 Oct 2020 12:33:18 +0000 > > > Will Deacon <will@kernel.org> wrote: > > > > > > > On Tue, Oct 27, 2020 at 12:59:09PM +0100, Jean-Philippe Brucker wrote: > > > > > On Tue, Oct 27, 2020 at 07:42:58PM +0900, Masami Hiramatsu wrote: > > > > > > On Tue, 27 Oct 2020 19:13:18 +0900 > > > > > > Masami Hiramatsu <mhiramat@kernel.org> wrote: > > > > > > > > > > > > > On Mon, 26 Oct 2020 18:29:09 +0100 > > > > > > > Jean-Philippe Brucker <jean-philippe@linaro.org> wrote: > > > > > > > > > > > > > > > To use debug features such as single-step, the OS lock must be unlocked > > > > > > > > in the debug registers. Currently this is done in postcore_initcall > > > > > > > > which is now too late. > > > > > > > > > > > > > > > > Commit 36dadef23fcc ("kprobes: Init kprobes in early_initcall") enabled > > > > > > > > using kprobes from early_initcall, when OS lock is still locked. So when > > > > > > > > kprobe attempts to single-step a patched instruction, instead of > > > > > > > > trapping, execution continues until it throws an undef exception: > > > > > > > > > > > > > > > > [ 0.064233] Kprobe smoke test: started > > > > > > > > [ 0.151133] ------------[ cut here ]------------ > > > > > > > > [ 0.151458] kernel BUG at arch/arm64/kernel/traps.c:406! > > > > > > > > [ 0.151812] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP > > > > > > > > ... > > > > > > > > [ 0.162689] Call trace: > > > > > > > > [ 0.163014] do_undefinstr+0x1d4/0x1f4 > > > > > > > > [ 0.163336] el1_sync_handler+0xbc/0x140 > > > > > > > > [ 0.163839] el1_sync+0x80/0x100 > > > > > > > > [ 0.164154] 0xffffffc01001d004 > > > > > > > > [ 0.164527] init_kprobes+0x13c/0x154 > > > > > > > > [ 0.164968] do_one_initcall+0x54/0x2e0 > > > > > > > > [ 0.165322] kernel_init_freeable+0xf4/0x258 > > > > > > > > [ 0.165783] kernel_init+0x20/0x12c > > > > > > > > [ 0.166117] ret_from_fork+0x10/0x30 > > > > > > > > [ 0.166595] Code: 97ffff53 a9425bf5 17ffff9b f9001bf7 (d4210000) > > > > > > > > [ 0.167084] ---[ end trace 36778fdf576e9a79 ]--- > > > > > > > > > > > > > > > > To fix this, unlock the OS lock as early as possible. Do it in > > > > > > > > traps_init() for CPU0, since KGDB wants to use single-step from that > > > > > > > > point on according to commit b322c65f8ca3 ("arm64: Call > > > > > > > > debug_traps_init() from trap_init() to help early kgdb"). > > > > > > > > For secondary CPUs, setup the CPU hotplug handler at early_initcall. > > > > > > > > > > > > > > > > Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall") > > > > > > > > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> > > > > > > > > > > > > > > Hi Jean, > > > > > > > > > > > > > > How have you confirmed this fixes the issue? > > > > > > > On my environment, this doesn't fix the issue. > > > > > > > > > > > > Oops, it was my mistake. I missed to boot up with Xen. (so I find another bug...) > > > > > > Anyway this works for me too. > > > > > > > > > > No worries :) Although now I've been wondering whether it would be better > > > > > to just disable the OS lock lazily, on the first call to > > > > > enable_debug_monitors(). It might add a tiny performance penalty but would > > > > > avoid this problem reappearing if one of the debugger needs to start even > > > > > earlier in the future. > > > > > > > > I'm still uneasy about enabling KDE with the watchpoint registers in an > > > > unknown state, so I think this needs more work. > > > > > > Hmm, how we reset it in the early stage? reset watchpoint registers first? > > > > Yes, I think so. Same order problem as the OS lock, they need to be reset > > before enable_debug_monitors(). On CPU0 that would be before > > early_initcall and for secondaries the hotplug notifier needs to be > > installed earlier as well. I'll send a v2. > > Cheers. An alternative (which I think would be better in the long run > anyway) would be to avoid using hardware step in kprobes and instead rely > on a BRK instruction to trap after running the trampoline. We started working on using the BRK instead of hardware step in kprobes in other threads. However, there still be a bug in the kernel. I would like to fix or at least mitigate this issue until this is released (since it's a bug) Would you think we can push the BRK only kprobes until it or in stable kernel? Or, we should add a mitigation patch for this bug? For the mitigation, I think we can introduce a kconfig flag which indicates the arch doesn't support early kprobes, in that case we defer the kprobe and boot-time trace later stage. This flag will be removed after we introduce the BRK-only kprobes. Thank you,
Hi Masami, On Thu, Nov 26, 2020 at 01:09:06AM +0900, Masami Hiramatsu wrote: > On Wed, 28 Oct 2020 08:36:44 +0000 > Will Deacon <will@kernel.org> wrote: > > Cheers. An alternative (which I think would be better in the long run > > anyway) would be to avoid using hardware step in kprobes and instead rely > > on a BRK instruction to trap after running the trampoline. > > We started working on using the BRK instead of hardware step in kprobes > in other threads. However, there still be a bug in the kernel. > I would like to fix or at least mitigate this issue until this is released > (since it's a bug) > > Would you think we can push the BRK only kprobes until it or in stable kernel? > Or, we should add a mitigation patch for this bug? > For the mitigation, I think we can introduce a kconfig flag which indicates > the arch doesn't support early kprobes, in that case we defer the kprobe and > boot-time trace later stage. This flag will be removed after we introduce the > BRK-only kprobes. The BRK stuff is merged upstream: http://git.kernel.org/linus/7ee31a3aa8f49 Are you saying that this isn't sufficient to fix the problem? Will
On Wed, 25 Nov 2020 16:11:34 +0000 Will Deacon <will@kernel.org> wrote: > Hi Masami, > > On Thu, Nov 26, 2020 at 01:09:06AM +0900, Masami Hiramatsu wrote: > > On Wed, 28 Oct 2020 08:36:44 +0000 > > Will Deacon <will@kernel.org> wrote: > > > Cheers. An alternative (which I think would be better in the long run > > > anyway) would be to avoid using hardware step in kprobes and instead rely > > > on a BRK instruction to trap after running the trampoline. > > > > We started working on using the BRK instead of hardware step in kprobes > > in other threads. However, there still be a bug in the kernel. > > I would like to fix or at least mitigate this issue until this is released > > (since it's a bug) > > > > Would you think we can push the BRK only kprobes until it or in stable kernel? > > Or, we should add a mitigation patch for this bug? > > For the mitigation, I think we can introduce a kconfig flag which indicates > > the arch doesn't support early kprobes, in that case we defer the kprobe and > > boot-time trace later stage. This flag will be removed after we introduce the > > BRK-only kprobes. > > The BRK stuff is merged upstream: > > http://git.kernel.org/linus/7ee31a3aa8f49 > > Are you saying that this isn't sufficient to fix the problem? Oops, No, it should be enough. I thought we were still in discussion on the other thread... Anyway, thank you for merging!
diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c index 75a423c3336a..80f082021234 100644 --- a/arch/arm64/kernel/debug-monitors.c +++ b/arch/arm64/kernel/debug-monitors.c @@ -135,7 +135,7 @@ static int __init debug_monitors_init(void) "arm64/debug_monitors:starting", clear_os_lock, NULL); } -postcore_initcall(debug_monitors_init); +early_initcall(debug_monitors_init); /* * Single step API and exception handling. @@ -380,6 +380,7 @@ NOKPROBE_SYMBOL(aarch32_break_handler); void __init debug_traps_init(void) { + clear_os_lock(0); hook_debug_fault_code(DBG_ESR_EVT_HWSS, single_step_handler, SIGTRAP, TRAP_TRACE, "single-step handler"); hook_debug_fault_code(DBG_ESR_EVT_BRK, brk_handler, SIGTRAP,
To use debug features such as single-step, the OS lock must be unlocked in the debug registers. Currently this is done in postcore_initcall which is now too late. Commit 36dadef23fcc ("kprobes: Init kprobes in early_initcall") enabled using kprobes from early_initcall, when OS lock is still locked. So when kprobe attempts to single-step a patched instruction, instead of trapping, execution continues until it throws an undef exception: [ 0.064233] Kprobe smoke test: started [ 0.151133] ------------[ cut here ]------------ [ 0.151458] kernel BUG at arch/arm64/kernel/traps.c:406! [ 0.151812] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ... [ 0.162689] Call trace: [ 0.163014] do_undefinstr+0x1d4/0x1f4 [ 0.163336] el1_sync_handler+0xbc/0x140 [ 0.163839] el1_sync+0x80/0x100 [ 0.164154] 0xffffffc01001d004 [ 0.164527] init_kprobes+0x13c/0x154 [ 0.164968] do_one_initcall+0x54/0x2e0 [ 0.165322] kernel_init_freeable+0xf4/0x258 [ 0.165783] kernel_init+0x20/0x12c [ 0.166117] ret_from_fork+0x10/0x30 [ 0.166595] Code: 97ffff53 a9425bf5 17ffff9b f9001bf7 (d4210000) [ 0.167084] ---[ end trace 36778fdf576e9a79 ]--- To fix this, unlock the OS lock as early as possible. Do it in traps_init() for CPU0, since KGDB wants to use single-step from that point on according to commit b322c65f8ca3 ("arm64: Call debug_traps_init() from trap_init() to help early kgdb"). For secondary CPUs, setup the CPU hotplug handler at early_initcall. Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall") Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> --- arch/arm64/kernel/debug-monitors.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)