Message ID | 20220124081501.235236-1-luofei@unicloud.com (mailing list archive) |
---|---|
State | New, archived |
Series | x86/mce: Always call kill_me_maybe() to handle memory failure in user mode |
On Mon, Jan 24, 2022 at 03:15:01AM -0500, luofei wrote:
> Just killing the current process is not enough, it is necessary
> to offload the faulty page.
>
> In the virtualization scenario, qemu does not set MCG_STATUS_RIPV by
> default.

Yes, we've had this before. Fix qemu.
diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 5818b837fd4d..bc6c353b9250 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1519,10 +1519,8 @@ noinstr void do_machine_check(struct pt_regs *regs)
 		BUG_ON(!on_thread_stack() || !user_mode(regs));
 
 		if (kill_current_task)
-			queue_task_work(&m, msg, kill_me_now);
-		else
-			queue_task_work(&m, msg, kill_me_maybe);
-
+			force_sig(SIGBUS);
+		queue_task_work(&m, msg, kill_me_maybe);
 	} else {
 		/*
 		 * Handle an MCE which has happened in kernel space but from
Just killing the current process is not enough, it is necessary
to offload the faulty page.

In the virtualization scenario, qemu does not set MCG_STATUS_RIPV by
default. When injecting an SRAR error into the virtual machine, only the
current process will be killed, but the faulty page will be released and
reused, which is very likely to cause the virtual machine to crash.

Signed-off-by: luofei <luofei@unicloud.com>
---
 arch/x86/kernel/cpu/mce/core.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)
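
For readers following the thread, the control-flow change proposed by the patch can be modelled in user space. This is only a rough sketch and not kernel code: the `ACT_*` flags, `MODEL_MCG_STATUS_RIPV`, and the `model_*`/`dispatch_*` helpers are hypothetical names introduced for illustration. It assumes that RIPV is bit 0 of MCG_STATUS, that a clear RIPV bit is what sets kill_current_task, that kill_me_now() only sends SIGBUS, and that kill_me_maybe() is the path that runs memory-failure handling on the page.

```c
#include <stdbool.h>

/* Hypothetical action flags; they only label what each path would do. */
enum action {
	ACT_SIGBUS         = 1 << 0,	/* current task receives SIGBUS */
	ACT_MEMORY_FAILURE = 1 << 1,	/* faulty page goes through memory-failure handling */
};

/* Assumption: RIPV is bit 0 of MCG_STATUS. */
#define MODEL_MCG_STATUS_RIPV (1ULL << 0)

/* Assumption: a clear RIPV bit means the task cannot be resumed. */
static bool model_kill_current_task(unsigned long long mcgstatus)
{
	return !(mcgstatus & MODEL_MCG_STATUS_RIPV);
}

/* Before the patch: exactly one of the two callbacks is queued. */
static unsigned int dispatch_before(bool kill_current_task)
{
	if (kill_current_task)
		return ACT_SIGBUS;		/* kill_me_now() path */
	return ACT_MEMORY_FAILURE;		/* kill_me_maybe() path */
}

/* As proposed: SIGBUS if required, and kill_me_maybe() always queued. */
static unsigned int dispatch_after(bool kill_current_task)
{
	unsigned int actions = 0;

	if (kill_current_task)
		actions |= ACT_SIGBUS;		/* force_sig(SIGBUS) */
	actions |= ACT_MEMORY_FAILURE;		/* kill_me_maybe() */
	return actions;
}
```

With RIPV clear (the qemu default described above), `dispatch_before()` yields only `ACT_SIGBUS`, while `dispatch_after()` yields both flags. With RIPV set, the two versions agree. The model says nothing about whether the guest-visible RIPV value should instead be corrected in qemu, which is what the reply in this thread asks for.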