Message ID | 20210222113124.35f2d552@alex-virtual-machine (mailing list archive) |
---|---|
State | New, archived |
Series | x86/mce: fix wrong no-return-ip logic in do_machine_check() |
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index e133ce1e562b..ae09b0279422 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1413,9 +1413,10 @@ noinstr void do_machine_check(struct pt_regs *regs)
 	if ((m.cs & 3) == 3) {
 		/* If this triggers there is no way to recover. Die hard. */
 		BUG_ON(!on_thread_stack() || !user_mode(regs));
-
-		queue_task_work(&m, kill_current_task);
-
+		if (worst == MCE_AR_SEVERITY)
+			queue_task_work(&m, 0);
+		else if (kill_current_task)
+			queue_task_work(&m, kill_current_task);
 	} else {
 		/*
 		 * Handle an MCE which has happened in kernel space but from
Since commit b2f9d678e28c ("x86/mce: Check for faults tagged in
EXTABLE_CLASS_FAULT exception table entries"), when a memory error of
MCE_AR_SEVERITY occurs with no return IP, only a SIGBUS signal is sent
to current. Because the page is not poisoned, the coredump step that the
kernel runs for the SIGBUS-killed process touches the error page again,
which results in a fatal error.

We need to poison the page and then kill current in the memory-failure
module, so fix this by using the original checking method.

Signed-off-by: Aili Yao <yaoaili@kingsoft.com>
---
 arch/x86/kernel/cpu/mce/core.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
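
For reviewers, the branch order introduced by the hunk can be modelled in userspace as a rough sketch. Everything below is hypothetical scaffolding, not kernel code: the enum names and values are invented for illustration, and the mapping of a zero second argument of queue_task_work() to the "poison, then kill" path is an assumption taken from the commit message rather than from the diff itself.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's severity levels (values are illustrative). */
enum model_severity {
	MODEL_KEEP_SEVERITY,
	MODEL_AO_SEVERITY,
	MODEL_AR_SEVERITY,
};

/* Hypothetical labels for what the user-mode branch ends up queueing. */
enum model_action {
	MODEL_NO_TASK_WORK,      /* nothing queued */
	MODEL_POISON_THEN_KILL,  /* models queue_task_work(&m, 0) (assumed meaning) */
	MODEL_SIGBUS_ONLY,       /* models queue_task_work(&m, kill_current_task) */
};

/* Behaviour before the patch: task work is always queued with kill_current_task. */
static enum model_action old_user_mode_action(enum model_severity worst,
					      bool kill_current_task)
{
	(void)worst; /* the old code did not look at the severity here */
	return kill_current_task ? MODEL_SIGBUS_ONLY : MODEL_POISON_THEN_KILL;
}

/* Behaviour after the patch: an action-required error always takes the poison path. */
static enum model_action new_user_mode_action(enum model_severity worst,
					      bool kill_current_task)
{
	if (worst == MODEL_AR_SEVERITY)
		return MODEL_POISON_THEN_KILL;
	if (kill_current_task)
		return MODEL_SIGBUS_ONLY;
	return MODEL_NO_TASK_WORK;
}
```

In this model the two functions differ in two cases. An action-required error with kill_current_task set now goes through the poison path instead of receiving only SIGBUS, which is the case the commit message describes. A non-action-required error with kill_current_task clear no longer queues any task work.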