perf: fix panic by disable ftrace on fault.c

Message ID d16e7188-1afa-7513-990c-804811747bcb@linux.alibaba.com (mailing list archive)
State Not Applicable

Commit Message

王贇 Sept. 13, 2021, 3:30 a.m. UTC
When running with ftrace function enabled, we observed the panic
below:

  traps: PANIC: double fault, error_code: 0x0
  [snip]
  RIP: 0010:perf_swevent_get_recursion_context+0x0/0x70
  [snip]
  Call Trace:
   <NMI>
   perf_trace_buf_alloc+0x26/0xd0
   perf_ftrace_function_call+0x18f/0x2e0
   kernelmode_fixup_or_oops+0x5/0x120
   __bad_area_nosemaphore+0x1b8/0x280
   do_user_addr_fault+0x410/0x920
   exc_page_fault+0x92/0x300
   asm_exc_page_fault+0x1e/0x30
  RIP: 0010:__get_user_nocheck_8+0x6/0x13
   perf_callchain_user+0x266/0x2f0
   get_perf_callchain+0x194/0x210
   perf_callchain+0xa3/0xc0
   perf_prepare_sample+0xa5/0xa60
   perf_event_output_forward+0x7b/0x1b0
   __perf_event_overflow+0x67/0x120
   perf_swevent_overflow+0xcb/0x110
   perf_swevent_event+0xb0/0xf0
   perf_tp_event+0x292/0x410
   perf_trace_run_bpf_submit+0x87/0xc0
   perf_trace_lock_acquire+0x12b/0x170
   lock_acquire+0x1bf/0x2e0
   perf_output_begin+0x70/0x4b0
   perf_log_throttle+0xe2/0x1a0
   perf_event_nmi_handler+0x30/0x50
   nmi_handle+0xba/0x2a0
   default_do_nmi+0x45/0xf0
   exc_nmi+0x155/0x170
   end_repeat_nmi+0x16/0x55

According to the trace, the sequence is as follows: the NMI
triggered perf IRQ throttling and called perf_log_throttle(),
which triggered the swevent overflow; the overflow path called
perf_callchain_user(), which triggered a user page fault (PF);
and the PF path triggered perf ftrace, which finally led to a
suspected stack overflow.

This patch disables ftrace for fault.c, which helps avoid the panic.

Reported-by: Abaci <abaci@linux.alibaba.com>
Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
---
 arch/x86/mm/Makefile | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Dave Hansen Sept. 13, 2021, 2:49 p.m. UTC | #1
On 9/12/21 8:30 PM, 王贇 wrote:
> According to the trace we know the story is like this, the NMI
> triggered perf IRQ throttling and call perf_log_throttle(),
> which triggered the swevent overflow, and the overflow process
> do perf_callchain_user() which triggered a user PF, and the PF
> process triggered perf ftrace which finally lead into a suspected
> stack overflow.
> 
> This patch disable ftrace on fault.c, which help to avoid the panic.
...
> +# Disable ftrace to avoid stack overflow.
> +CFLAGS_REMOVE_fault.o = $(CC_FLAGS_FTRACE)

Was this observed on a mainline kernel?

How reproducible is this?

I suspect we're going into do_user_addr_fault(), then falling in here:

>         if (unlikely(faulthandler_disabled() || !mm)) {
>                 bad_area_nosemaphore(regs, error_code, address);
>                 return;
>         }

Then something double faults in perf_swevent_get_recursion_context().
But, you snipped all of the register dump out so I can't quite see
what's going on and what might have caused *that* fault.  But, in my
kernel perf_swevent_get_recursion_context+0x0/0x70 is:

	   mov    $0x27d00,%rdx

which is rather unlikely to fault.

Either way, we don't want to keep ftrace out of fault.c.  This patch is
just a hack, and doesn't really try to fix the underlying problem.  This
situation *should* be handled today.  There's code there to handle it.

Something else really funky is going on.
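
For readers unfamiliar with the function named in the faulting RIP:
perf_swevent_get_recursion_context() is, as its name suggests, a recursion
guard. The following is a simplified user-space sketch of a per-context guard
of that kind, not the kernel's implementation. The enum, the explicit level
argument, and the function bodies are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical context levels; one guard slot per level. */
enum ctx_level { CTX_TASK, CTX_SOFTIRQ, CTX_HARDIRQ, CTX_NMI, CTX_MAX };

/*
 * Claim the slot for the given level.
 * Returns the slot index on success, or -1 if an event is already
 * being processed at this level (i.e. we are recursing).
 */
int get_recursion_context(int *recursion, enum ctx_level level)
{
	if (recursion[level])
		return -1;
	recursion[level]++;
	return (int)level;
}

/* Release a slot previously returned by get_recursion_context(). */
void put_recursion_context(int *recursion, int rctx)
{
	recursion[rctx]--;
}
```

With a guard like this, a second event raised while a first one is still being
handled at the same level would be rejected with -1 instead of nesting
further. The guard can only help if the call into it completes, which the
trace suggests did not happen here.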
王贇 Sept. 14, 2021, 1:52 a.m. UTC | #2
Hi, Dave, Peter

It is good to have you digging into the root cause. Before checking the
details, please allow me to paste the whole trace and the reproducer here.

Below is the full trace, triggered with the latest linux-next master branch:

[   58.999453][    C0] traps: PANIC: double fault, error_code: 0x0
[   58.999472][    C0] double fault: 0000 [#1] SMP PTI
[   58.999478][    C0] CPU: 0 PID: 799 Comm: a.out Not tainted 5.14.0+ #107
[   58.999485][    C0] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[   58.999488][    C0] RIP: 0010:perf_swevent_get_recursion_context+0x0/0x70
[   58.999505][    C0] Code: 48 03 43 28 48 8b 0c 24 bb 01 00 00 00 4c 29 f0 48 39 c8 48 0f 47 c1 49 89 45 08 e9 48 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 <55> 53 e8 89 18 f2 ff 48 c7 c2 20 4d 03 00 65 48 03 15 5a 34 d2 7e
[   58.999511][    C0] RSP: 0018:fffffe000000b000 EFLAGS: 00010046
[   58.999517][    C0] RAX: 0000000080120005 RBX: fffffe000000b050 RCX: 0000000000000000
[   58.999522][    C0] RDX: ffff888106f5a180 RSI: ffffffff812696d1 RDI: 000000000000001c
[   58.999526][    C0] RBP: 000000000000001c R08: 0000000000000001 R09: 0000000000000000
[   58.999530][    C0] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[   58.999533][    C0] R13: fffffe000000b044 R14: 0000000000000001 R15: 0000000000000001
[   58.999537][    C0] FS:  00007f21fc62c740(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
[   58.999543][    C0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   58.999547][    C0] CR2: fffffe000000aff8 CR3: 0000000106e2e001 CR4: 00000000003606f0
[   58.999551][    C0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   58.999555][    C0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   58.999559][    C0] Call Trace:
[   58.999562][    C0]  <NMI>
[   58.999565][    C0]  perf_trace_buf_alloc+0x26/0xd0
[   58.999579][    C0]  ? is_prefetch.isra.25+0x260/0x260
[   58.999586][    C0]  ? __bad_area_nosemaphore+0x1b8/0x280
[   58.999592][    C0]  perf_ftrace_function_call+0x18f/0x2e0
[   58.999604][    C0]  ? perf_trace_buf_alloc+0xbf/0xd0
[   58.999642][    C0]  ? 0xffffffffa00ba083
[   58.999669][    C0]  0xffffffffa00ba083
[   58.999688][    C0]  ? 0xffffffffa00ba083
[   58.999708][    C0]  ? kernelmode_fixup_or_oops+0x5/0x120
[   58.999721][    C0]  kernelmode_fixup_or_oops+0x5/0x120
[   58.999728][    C0]  __bad_area_nosemaphore+0x1b8/0x280
[   58.999747][    C0]  do_user_addr_fault+0x410/0x920
[   58.999763][    C0]  ? 0xffffffffa00ba083
[   58.999780][    C0]  exc_page_fault+0x92/0x300
[   58.999796][    C0]  asm_exc_page_fault+0x1e/0x30
[   58.999805][    C0] RIP: 0010:__get_user_nocheck_8+0x6/0x13
[   58.999814][    C0] Code: 01 ca c3 90 0f 01 cb 0f ae e8 0f b7 10 31 c0 0f 01 ca c3 90 0f 01 cb 0f ae e8 8b 10 31 c0 0f 01 ca c3 66 90 0f 01 cb 0f ae e8 <48> 8b 10 31 c0 0f 01 ca c3 90 0f 01 ca 31 d2 48 c7 c0 f2 ff ff ff
[   58.999819][    C0] RSP: 0018:fffffe000000b370 EFLAGS: 00050046
[   58.999825][    C0] RAX: 0000000000000000 RBX: fffffe000000b3d0 RCX: 0000000000000000
[   58.999828][    C0] RDX: ffff888106f5a180 RSI: ffffffff8100a91e RDI: fffffe000000b3d0
[   58.999832][    C0] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
[   58.999836][    C0] R10: 0000000000000000 R11: 0000000000000014 R12: 00007fffffffeff0
[   58.999839][    C0] R13: ffff888106f5a180 R14: 000000000000007f R15: 000000000000007f
[   58.999867][    C0]  ? perf_callchain_user+0x25e/0x2f0
[   58.999886][    C0]  perf_callchain_user+0x266/0x2f0
[   58.999907][    C0]  get_perf_callchain+0x194/0x210
[   58.999938][    C0]  perf_callchain+0xa3/0xc0
[   58.999956][    C0]  perf_prepare_sample+0xa5/0xa60
[   58.999984][    C0]  perf_event_output_forward+0x7b/0x1b0
[   58.999996][    C0]  ? perf_swevent_get_recursion_context+0x62/0x70
[   59.000008][    C0]  ? perf_trace_buf_alloc+0xbf/0xd0
[   59.000026][    C0]  __perf_event_overflow+0x67/0x120
[   59.000042][    C0]  perf_swevent_overflow+0xcb/0x110
[   59.000065][    C0]  perf_swevent_event+0xb0/0xf0
[   59.000078][    C0]  perf_tp_event+0x292/0x410
[   59.000085][    C0]  ? 0xffffffffa00ba083
[   59.000120][    C0]  ? tracing_gen_ctx_irq_test+0x8f/0xa0
[   59.000129][    C0]  ? perf_swevent_event+0x28/0xf0
[   59.000142][    C0]  ? perf_tp_event+0x2d7/0x410
[   59.000150][    C0]  ? tracing_gen_ctx_irq_test+0x8f/0xa0
[   59.000157][    C0]  ? perf_swevent_event+0x28/0xf0
[   59.000171][    C0]  ? perf_tp_event+0x2d7/0x410
[   59.000179][    C0]  ? tracing_gen_ctx_irq_test+0x8f/0xa0
[   59.000198][    C0]  ? tracing_gen_ctx_irq_test+0x8f/0xa0
[   59.000206][    C0]  ? perf_swevent_event+0x28/0xf0
[   59.000233][    C0]  ? perf_trace_run_bpf_submit+0x87/0xc0
[   59.000244][    C0]  ? perf_trace_buf_alloc+0x86/0xd0
[   59.000250][    C0]  perf_trace_run_bpf_submit+0x87/0xc0
[   59.000276][    C0]  perf_trace_lock_acquire+0x12b/0x170
[   59.000308][    C0]  lock_acquire+0x1bf/0x2e0
[   59.000317][    C0]  ? perf_output_begin+0x5/0x4b0
[   59.000348][    C0]  perf_output_begin+0x70/0x4b0
[   59.000356][    C0]  ? perf_output_begin+0x5/0x4b0
[   59.000394][    C0]  perf_log_throttle+0xe2/0x1a0
[   59.000431][    C0]  ? 0xffffffffa00ba083
[   59.000447][    C0]  ? perf_event_update_userpage+0x135/0x2d0
[   59.000462][    C0]  ? 0xffffffffa00ba083
[   59.000471][    C0]  ? 0xffffffffa00ba083
[   59.000495][    C0]  ? perf_event_update_userpage+0x135/0x2d0
[   59.000506][    C0]  ? rcu_read_lock_held_common+0x5/0x40
[   59.000519][    C0]  ? rcu_read_lock_held_common+0xe/0x40
[   59.000528][    C0]  ? rcu_read_lock_sched_held+0x23/0x80
[   59.000539][    C0]  ? lock_release+0xc7/0x2b0
[   59.000560][    C0]  ? __perf_event_account_interrupt+0x116/0x160
[   59.000576][    C0]  __perf_event_account_interrupt+0x116/0x160
[   59.000589][    C0]  __perf_event_overflow+0x3e/0x120
[   59.000604][    C0]  handle_pmi_common+0x30f/0x400
[   59.000611][    C0]  ? perf_ftrace_function_call+0x268/0x2e0
[   59.000620][    C0]  ? perf_ftrace_function_call+0x53/0x2e0
[   59.000663][    C0]  ? 0xffffffffa00ba083
[   59.000689][    C0]  ? 0xffffffffa00ba083
[   59.000729][    C0]  ? intel_pmu_handle_irq+0x120/0x620
[   59.000737][    C0]  ? handle_pmi_common+0x5/0x400
[   59.000743][    C0]  intel_pmu_handle_irq+0x120/0x620
[   59.000767][    C0]  perf_event_nmi_handler+0x30/0x50
[   59.000779][    C0]  nmi_handle+0xba/0x2a0
[   59.000806][    C0]  default_do_nmi+0x45/0xf0
[   59.000819][    C0]  exc_nmi+0x155/0x170
[   59.000838][    C0]  end_repeat_nmi+0x16/0x55
[   59.000845][    C0] RIP: 0010:__sanitizer_cov_trace_pc+0xd/0x60
[   59.000853][    C0] Code: 00 75 10 65 48 8b 04 25 c0 71 01 00 48 8b 80 88 15 00 00 f3 c3 0f 1f 84 00 00 00 00 00 65 8b 05 09 77 e0 7e 89 c1 48 8b 34 24 <65> 48 8b 14 25 c0 71 01 00 81 e1 00 01 00 00 a9 00 01 ff 00 74 10
[   59.000858][    C0] RSP: 0000:ffffc90000003dd0 EFLAGS: 00000046
[   59.000863][    C0] RAX: 0000000080010001 RBX: ffffffff82a1db40 RCX: 0000000080010001
[   59.000867][    C0] RDX: ffff888106f5a180 RSI: ffffffff81009613 RDI: 0000000000000000
[   59.000871][    C0] RBP: ffff88813bc40d08 R08: ffff888106f5abb8 R09: 00000000fffffffe
[   59.000875][    C0] R10: ffffc90000003be0 R11: 00000000ffd17b4b R12: ffff88813bc118a0
[   59.000878][    C0] R13: ffff88813bc40c00 R14: 0000000000000000 R15: ffffffff82a1db40
[   59.000906][    C0]  ? x86_pmu_enable+0x383/0x440
[   59.000924][    C0]  ? __sanitizer_cov_trace_pc+0xd/0x60
[   59.000942][    C0]  ? intel_pmu_handle_irq+0x284/0x620
[   59.000954][    C0]  </NMI>
[   59.000957][    C0] WARNING: stack recursion on stack type 6
[   59.000960][    C0] Modules linked in:
[   59.120070][    C0] ---[ end trace 07eb1e3908914794 ]---
[   59.120075][    C0] RIP: 0010:perf_swevent_get_recursion_context+0x0/0x70
[   59.120087][    C0] Code: 48 03 43 28 48 8b 0c 24 bb 01 00 00 00 4c 29 f0 48 39 c8 48 0f 47 c1 49 89 45 08 e9 48 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 <55> 53 e8 89 18 f2 ff 48 c7 c2 20 4d 03 00 65 48 03 15 5a 34 d2 7e
[   59.120092][    C0] RSP: 0018:fffffe000000b000 EFLAGS: 00010046
[   59.120098][    C0] RAX: 0000000080120005 RBX: fffffe000000b050 RCX: 0000000000000000
[   59.120102][    C0] RDX: ffff888106f5a180 RSI: ffffffff812696d1 RDI: 000000000000001c
[   59.120106][    C0] RBP: 000000000000001c R08: 0000000000000001 R09: 0000000000000000
[   59.120110][    C0] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[   59.120114][    C0] R13: fffffe000000b044 R14: 0000000000000001 R15: 0000000000000001
[   59.120118][    C0] FS:  00007f21fc62c740(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
[   59.120125][    C0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   59.120129][    C0] CR2: fffffe000000aff8 CR3: 0000000106e2e001 CR4: 00000000003606f0
[   59.120133][    C0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   59.120137][    C0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   59.120141][    C0] Kernel panic - not syncing: Fatal exception in interrupt
[   59.120540][    C0] Kernel Offset: disabled
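
One detail in the register dump above can be checked with simple arithmetic.
The byte marked in the Code: line is 0x55 (push %rbp), RSP is
fffffe000000b000, and CR2 is fffffe000000aff8, which is exactly RSP - 8, the
address that push would store to. That is consistent with the stack running
out at a page boundary, though it does not by itself show what is mapped in
the page below. A minimal sketch of the check, assuming 4 KiB pages; the
helper names are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096ull /* assumption: 4 KiB pages */

/* Address a 64-bit push stores to: RSP is decremented by 8 first. */
uint64_t push_target(uint64_t rsp)
{
	return rsp - 8;
}

/* True if the push would store into a lower page than the one RSP is in. */
bool push_crosses_page(uint64_t rsp)
{
	return push_target(rsp) / PAGE_SIZE_BYTES != rsp / PAGE_SIZE_BYTES;
}
```

Calling push_target(0xfffffe000000b000) gives 0xfffffe000000aff8, the CR2
value in the dump, and push_crosses_page() is true for that RSP but false for
the earlier frame at fffffe000000b370.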

And below is the reproducer:


// autogenerated by syzkaller (https://github.com/google/syzkaller)

#define _GNU_SOURCE

#include <dirent.h>
#include <endian.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/prctl.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static void sleep_ms(uint64_t ms)
{
	usleep(ms * 1000);
}

static uint64_t current_time_ms(void)
{
	struct timespec ts;
	if (clock_gettime(CLOCK_MONOTONIC, &ts))
		exit(1);
	return (uint64_t)ts.tv_sec * 1000 + (uint64_t)ts.tv_nsec / 1000000;
}

#define BITMASK(bf_off,bf_len) (((1ull << (bf_len)) - 1) << (bf_off))
#define STORE_BY_BITMASK(type,htobe,addr,val,bf_off,bf_len) *(type*)(addr) = htobe((htobe(*(type*)(addr)) & ~BITMASK((bf_off), (bf_len))) | (((type)(val) << (bf_off)) & BITMASK((bf_off), (bf_len))))

static bool write_file(const char* file, const char* what, ...)
{
	char buf[1024];
	va_list args;
	va_start(args, what);
	vsnprintf(buf, sizeof(buf), what, args);
	va_end(args);
	buf[sizeof(buf) - 1] = 0;
	int len = strlen(buf);
	int fd = open(file, O_WRONLY | O_CLOEXEC);
	if (fd == -1)
		return false;
	if (write(fd, buf, len) != len) {
		int err = errno;
		close(fd);
		errno = err;
		return false;
	}
	close(fd);
	return true;
}

static void kill_and_wait(int pid, int* status)
{
	kill(-pid, SIGKILL);
	kill(pid, SIGKILL);
	for (int i = 0; i < 100; i++) {
		if (waitpid(-1, status, WNOHANG | __WALL) == pid)
			return;
		usleep(1000);
	}
	DIR* dir = opendir("/sys/fs/fuse/connections");
	if (dir) {
		for (;;) {
			struct dirent* ent = readdir(dir);
			if (!ent)
				break;
			if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
				continue;
			char abort[300];
			snprintf(abort, sizeof(abort), "/sys/fs/fuse/connections/%s/abort", ent->d_name);
			int fd = open(abort, O_WRONLY);
			if (fd == -1) {
				continue;
			}
			if (write(fd, abort, 1) < 0) {
			}
			close(fd);
		}
		closedir(dir);
	} else {
	}
	while (waitpid(-1, status, __WALL) != pid) {
	}
}

static void setup_test()
{
	prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
	setpgrp();
	write_file("/proc/self/oom_score_adj", "1000");
}

static void execute_one(void);

#define WAIT_FLAGS __WALL

static void loop(void)
{
	int iter = 0;
	for (;; iter++) {
		int pid = fork();
		if (pid < 0)
			exit(1);
		if (pid == 0) {
			setup_test();
			execute_one();
			exit(0);
		}
		int status = 0;
		uint64_t start = current_time_ms();
		for (;;) {
			if (waitpid(-1, &status, WNOHANG | WAIT_FLAGS) == pid)
				break;
			sleep_ms(1);
			if (current_time_ms() - start < 5000) {
				continue;
			}
			kill_and_wait(pid, &status);
			break;
		}
	}
}

void execute_one(void)
{
*(uint32_t*)0x20000380 = 2;
*(uint32_t*)0x20000384 = 0x70;
*(uint8_t*)0x20000388 = 1;
*(uint8_t*)0x20000389 = 0;
*(uint8_t*)0x2000038a = 0;
*(uint8_t*)0x2000038b = 0;
*(uint32_t*)0x2000038c = 0;
*(uint64_t*)0x20000390 = 0;
*(uint64_t*)0x20000398 = 0;
*(uint64_t*)0x200003a0 = 0;
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 0, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 1, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 2, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 3, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 4, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 5, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 6, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 7, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 8, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 9, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 10, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 11, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 12, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 13, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 14, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 15, 2);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 17, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 18, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 19, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 20, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 21, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 22, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 23, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 24, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 25, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 26, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 27, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 28, 1);
STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 29, 35);
*(uint32_t*)0x200003b0 = 0;
*(uint32_t*)0x200003b4 = 0;
*(uint64_t*)0x200003b8 = 0;
*(uint64_t*)0x200003c0 = 0;
*(uint64_t*)0x200003c8 = 0;
*(uint64_t*)0x200003d0 = 0;
*(uint32_t*)0x200003d8 = 0;
*(uint32_t*)0x200003dc = 0;
*(uint64_t*)0x200003e0 = 0;
*(uint32_t*)0x200003e8 = 0;
*(uint16_t*)0x200003ec = 0;
*(uint16_t*)0x200003ee = 0;
	syscall(__NR_perf_event_open, 0x20000380ul, -1, 0ul, -1, 0ul);
*(uint32_t*)0x20000080 = 0;
*(uint32_t*)0x20000084 = 0x70;
*(uint8_t*)0x20000088 = 0;
*(uint8_t*)0x20000089 = 0;
*(uint8_t*)0x2000008a = 0;
*(uint8_t*)0x2000008b = 0;
*(uint32_t*)0x2000008c = 0;
*(uint64_t*)0x20000090 = 0x9c;
*(uint64_t*)0x20000098 = 0;
*(uint64_t*)0x200000a0 = 0;
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 0, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 1, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 2, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 3, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 4, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 5, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 6, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 7, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 8, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 9, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 10, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 11, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 12, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 13, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 14, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 15, 2);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 17, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 18, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 19, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 20, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 21, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 22, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 23, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 24, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 25, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 26, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 27, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 28, 1);
STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 29, 35);
*(uint32_t*)0x200000b0 = 0;
*(uint32_t*)0x200000b4 = 0;
*(uint64_t*)0x200000b8 = 0;
*(uint64_t*)0x200000c0 = 0;
*(uint64_t*)0x200000c8 = 0;
*(uint64_t*)0x200000d0 = 0;
*(uint32_t*)0x200000d8 = 0;
*(uint32_t*)0x200000dc = 0;
*(uint64_t*)0x200000e0 = 0;
*(uint32_t*)0x200000e8 = 0;
*(uint16_t*)0x200000ec = 0;
*(uint16_t*)0x200000ee = 0;
	syscall(__NR_perf_event_open, 0x20000080ul, -1, 0ul, -1, 0ul);
*(uint32_t*)0x20000140 = 2;
*(uint32_t*)0x20000144 = 0x70;
*(uint8_t*)0x20000148 = 0x47;
*(uint8_t*)0x20000149 = 1;
*(uint8_t*)0x2000014a = 0;
*(uint8_t*)0x2000014b = 0;
*(uint32_t*)0x2000014c = 0;
*(uint64_t*)0x20000150 = 9;
*(uint64_t*)0x20000158 = 0x61220;
*(uint64_t*)0x20000160 = 0;
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 0, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 1, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 2, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 3, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 4, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 5, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 6, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 7, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 8, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 9, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 10, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 11, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 12, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 13, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 14, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 15, 2);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 17, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 18, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 19, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 20, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 21, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 22, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 23, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 24, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 25, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 26, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 27, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 28, 1);
STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 29, 35);
*(uint32_t*)0x20000170 = 0;
*(uint32_t*)0x20000174 = 0;
*(uint64_t*)0x20000178 = 0;
*(uint64_t*)0x20000180 = 0;
*(uint64_t*)0x20000188 = 0;
*(uint64_t*)0x20000190 = 1;
*(uint32_t*)0x20000198 = 0;
*(uint32_t*)0x2000019c = 0;
*(uint64_t*)0x200001a0 = 2;
*(uint32_t*)0x200001a8 = 0;
*(uint16_t*)0x200001ac = 0;
*(uint16_t*)0x200001ae = 0;
	syscall(__NR_perf_event_open, 0x20000140ul, 0, -1ul, -1, 0ul);

}
int main(void)
{
	syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
	syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
	syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
	loop();
	return 0;
}
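
For readability, the third perf_event_open() call in the reproducer can be
expressed with named struct perf_event_attr fields. The sketch below assumes
the raw stores at 0x20000140 map onto the UAPI layout from
<linux/perf_event.h> (size 0x70 is PERF_ATTR_SIZE_VER5); the helper name is
hypothetical. Decoding sample_type 0x61220 shows that PERF_SAMPLE_CALLCHAIN is
set, which matches the perf_callchain_user() frame in the trace. Which
tracepoint the id 0x147 refers to depends on the running kernel and is not
decoded here.

```c
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: rebuilds the attr the reproducer writes at 0x20000140. */
struct perf_event_attr third_event_attr(void)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.type = PERF_TYPE_TRACEPOINT;	/* raw value 2 */
	attr.size = PERF_ATTR_SIZE_VER5;	/* raw value 0x70 */
	attr.config = 0x147;			/* tracepoint id, kernel specific */
	attr.sample_period = 9;
	/* 0x61220 decoded into its individual flags */
	attr.sample_type = PERF_SAMPLE_CALLCHAIN |
			   PERF_SAMPLE_STREAM_ID |
			   PERF_SAMPLE_REGS_USER |
			   PERF_SAMPLE_TRANSACTION |
			   PERF_SAMPLE_REGS_INTR;
	attr.sample_regs_user = 1;
	attr.sample_regs_intr = 2;
	return attr;
}
```

The reproducer passes this attr with pid 0, cpu -1, group_fd -1 and flags 0,
so the event follows the calling task on any CPU.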

Regards,
Michael Wang


王贇 Sept. 14, 2021, 2:08 a.m. UTC | #3
On 2021/9/13 10:49 PM, Dave Hansen wrote:
> On 9/12/21 8:30 PM, 王贇 wrote:
>> According to the trace we know the story is like this, the NMI
>> triggered perf IRQ throttling and call perf_log_throttle(),
>> which triggered the swevent overflow, and the overflow process
>> do perf_callchain_user() which triggered a user PF, and the PF
>> process triggered perf ftrace which finally lead into a suspected
>> stack overflow.
>>
>> This patch disable ftrace on fault.c, which help to avoid the panic.
> ...
>> +# Disable ftrace to avoid stack overflow.
>> +CFLAGS_REMOVE_fault.o = $(CC_FLAGS_FTRACE)
> 
> Was this observed on a mainline kernel?

Yes, it is triggered on linux-next.

> 
> How reproducible is this?
> 
> I suspect we're going into do_user_addr_fault(), then falling in here:
> 
>>         if (unlikely(faulthandler_disabled() || !mm)) {
>>                 bad_area_nosemaphore(regs, error_code, address);
>>                 return;
>>         }
> 

Correct, perf_callchain_user() disables page faults, which leads us here.

> Then something double faults in perf_swevent_get_recursion_context().
> But, you snipped all of the register dump out so I can't quite see
> what's going on and what might have caused *that* fault.  But, in my
> kernel perf_swevent_get_recursion_context+0x0/0x70 is:
> 
> 	   mov    $0x27d00,%rdx
> 
> which is rather unlikely to fault.

Could you check the full trace I just sent, to see if we can get any
clues from it?

> 
> Either way, we don't want to keep ftrace out of fault.c.  This patch is
> just a hack, and doesn't really try to fix the underlying problem.  This
> situation *should* be handled today.  There's code there to handle it.
> 
> Something else really funky is going on.

Do you think a stack overflow is possible in this case? Note that the NMIs
arrive at a very high frequency, and reducing perf_event_max_sample_rate to a
low value also avoids the panic.
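
As a sketch of that workaround: the sysctl kernel.perf_event_max_sample_rate
is exposed as /proc/sys/kernel/perf_event_max_sample_rate, and lowering it is
just a write of a decimal number to that file (root is required). The helper
name below is hypothetical, and the thread does not say which value was used,
so any concrete number passed to it is an assumption.

```c
#include <stdio.h>

/*
 * Write a decimal value to a sysctl-style file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
int write_sysctl_value(const char *path, unsigned int value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%u\n", value) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

For example,
write_sysctl_value("/proc/sys/kernel/perf_event_max_sample_rate", 1000)
would cap the rate at 1000 samples per second; 1000 is only an illustrative
value.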

Regards,
Michael Wang

>
王贇 Sept. 14, 2021, 3:02 a.m. UTC | #4
On 2021/9/14 9:52 AM, 王贇 wrote:
> Hi, Dave, Peter
> 
> Nice to have you guys digging the root cause, please allow me to paste whole
> trace and the way of reproduce here firstly before checking the details:
> 
> Below is the full trace, triggered with the latest linux-next master branch:

After rechecking, I found that the log above is from the linux repo, not
linux-next. Below is the log from the linux-next commit 24a36d3171e4
("Add linux-next specific files for 20210913"):

[   44.106891][    C0] perf: interrupt took too long (5127 > 5062), lowering kernel.perf_event_max_sample_rate to 39000
[   44.110727][    C0] perf: interrupt took too long (10133 > 10111), lowering kernel.perf_event_max_sample_rate to 19000
[   44.114496][    C0] perf: interrupt took too long (12698 > 12666), lowering kernel.perf_event_max_sample_rate to 15000
[   44.123810][    C0] perf: interrupt took too long (16151 > 15872), lowering kernel.perf_event_max_sample_rate to 12000
[   44.128746][    C0] perf: interrupt took too long (20433 > 20188), lowering kernel.perf_event_max_sample_rate to 9000
[   44.133509][    C0] traps: PANIC: double fault, error_code: 0x0
[   44.133519][    C0] double fault: 0000 [#1] SMP PTI
[   44.133526][    C0] CPU: 0 PID: 743 Comm: a.out Not tainted 5.14.0-next-20210913 #469
[   44.133532][    C0] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[   44.133536][    C0] RIP: 0010:perf_swevent_get_recursion_context+0x0/0x70
[   44.133549][    C0] Code: 48 03 43 28 48 8b 0c 24 bb 01 00 00 00 4c 29 f0 48 39 c8 48 0f 47 c1 49 89 45 08 e9 48 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 <55> 53 e8 09 20 f2 ff 48 c7 c2 20 4d 03 00 65 48 03 15 5a 3b d2 7e
[   44.133556][    C0] RSP: 0018:fffffe000000b000 EFLAGS: 00010046
[   44.133562][    C0] RAX: 0000000080120007 RBX: fffffe000000b050 RCX: 0000000000000000
[   44.133566][    C0] RDX: ffff888106dd8000 RSI: ffffffff81269031 RDI: 000000000000001c
[   44.133570][    C0] RBP: 000000000000001c R08: 0000000000000001 R09: 0000000000000000
[   44.133574][    C0] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[   44.133578][    C0] R13: fffffe000000b044 R14: 0000000000000001 R15: 0000000000000001
[   44.133582][    C0] FS:  00007f5f39086740(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
[   44.133588][    C0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   44.133593][    C0] CR2: fffffe000000aff8 CR3: 0000000105894005 CR4: 00000000003606f0
[   44.133597][    C0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   44.133600][    C0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   44.133604][    C0] Call Trace:
[   44.133607][    C0]  <NMI>
[   44.133610][    C0]  perf_trace_buf_alloc+0x26/0xd0
[   44.133623][    C0]  ? is_prefetch.isra.25+0x260/0x260
[   44.133631][    C0]  ? __bad_area_nosemaphore+0x1b8/0x280
[   44.133637][    C0]  perf_ftrace_function_call+0x18f/0x2e0
[   44.133649][    C0]  ? perf_trace_buf_alloc+0xbf/0xd0
[   44.133687][    C0]  ? 0xffffffffa00b0083
[   44.133714][    C0]  0xffffffffa00b0083
[   44.133733][    C0]  ? 0xffffffffa00b0083
[   44.133753][    C0]  ? kernelmode_fixup_or_oops+0x5/0x120
[   44.133773][    C0]  kernelmode_fixup_or_oops+0x5/0x120
[   44.133780][    C0]  __bad_area_nosemaphore+0x1b8/0x280
[   44.133799][    C0]  do_user_addr_fault+0x410/0x920
[   44.133815][    C0]  ? 0xffffffffa00b0083
[   44.133832][    C0]  exc_page_fault+0x92/0x300
[   44.133849][    C0]  asm_exc_page_fault+0x1e/0x30
[   44.133857][    C0] RIP: 0010:__get_user_nocheck_8+0x6/0x13
[   44.133866][    C0] Code: 01 ca c3 90 0f 01 cb 0f ae e8 0f b7 10 31 c0 0f 01 ca c3 90 0f 01 cb 0f ae e8 8b 10 31 c0 0f 01 ca c3 66 90 0f 01 cb 0f ae e8 <48> 8b 10 31 c0 0f 01 ca c3 90 0f 01 ca 31 d2 48 c7 c0 f2 ff ff ff
[   44.133872][    C0] RSP: 0018:fffffe000000b370 EFLAGS: 00050046
[   44.133877][    C0] RAX: 0000000000000000 RBX: fffffe000000b3d0 RCX: 0000000000000000
[   44.133881][    C0] RDX: ffff888106dd8000 RSI: ffffffff8100a8ee RDI: fffffe000000b3d0
[   44.133885][    C0] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
[   44.133889][    C0] R10: 0000000000000000 R11: 0000000000000014 R12: 00007fffffffeff0
[   44.133893][    C0] R13: ffff888106dd8000 R14: 000000000000007f R15: 000000000000007f
[   44.133920][    C0]  ? perf_callchain_user+0x25e/0x2f0
[   44.133940][    C0]  perf_callchain_user+0x266/0x2f0
[   44.133961][    C0]  get_perf_callchain+0x194/0x210
[   44.133992][    C0]  perf_callchain+0xa3/0xc0
[   44.134010][    C0]  perf_prepare_sample+0xa5/0xa60
[   44.134037][    C0]  perf_event_output_forward+0x7b/0x1b0
[   44.134051][    C0]  ? perf_swevent_get_recursion_context+0x62/0x70
[   44.134062][    C0]  ? perf_trace_buf_alloc+0xbf/0xd0
[   44.134080][    C0]  __perf_event_overflow+0x67/0x120
[   44.134096][    C0]  perf_swevent_overflow+0xcb/0x110
[   44.134114][    C0]  perf_swevent_event+0xb0/0xf0
[   44.134128][    C0]  perf_tp_event+0x292/0x410
[   44.134135][    C0]  ? 0xffffffffa00b0083
[   44.134170][    C0]  ? tracing_gen_ctx_irq_test+0x8f/0xc0
[   44.134179][    C0]  ? perf_swevent_event+0x28/0xf0
[   44.134192][    C0]  ? perf_tp_event+0x2d7/0x410
[   44.134200][    C0]  ? tracing_gen_ctx_irq_test+0x8f/0xc0
[   44.134208][    C0]  ? perf_swevent_event+0x28/0xf0
[   44.134221][    C0]  ? perf_tp_event+0x2d7/0x410
[   44.134230][    C0]  ? tracing_gen_ctx_irq_test+0x8f/0xc0
[   44.134250][    C0]  ? tracing_gen_ctx_irq_test+0x8f/0xc0
[   44.134257][    C0]  ? perf_swevent_event+0x28/0xf0
[   44.134284][    C0]  ? perf_trace_run_bpf_submit+0x87/0xc0
[   44.134295][    C0]  ? perf_trace_buf_alloc+0x86/0xd0
[   44.134302][    C0]  perf_trace_run_bpf_submit+0x87/0xc0
[   44.134327][    C0]  perf_trace_lock_acquire+0x12b/0x170
[   44.134360][    C0]  lock_acquire+0x1bf/0x2e0
[   44.134370][    C0]  ? perf_output_begin+0x5/0x4b0
[   44.134401][    C0]  perf_output_begin+0x70/0x4b0
[   44.134408][    C0]  ? perf_output_begin+0x5/0x4b0
[   44.134446][    C0]  perf_log_throttle+0xe2/0x1a0
[   44.134484][    C0]  ? 0xffffffffa00b0083
[   44.134500][    C0]  ? perf_event_update_userpage+0x135/0x2d0
[   44.134515][    C0]  ? 0xffffffffa00b0083
[   44.134524][    C0]  ? 0xffffffffa00b0083
[   44.134548][    C0]  ? perf_event_update_userpage+0x135/0x2d0
[   44.134559][    C0]  ? rcu_read_lock_held_common+0x5/0x40
[   44.134573][    C0]  ? rcu_read_lock_held_common+0xe/0x40
[   44.134582][    C0]  ? rcu_read_lock_sched_held+0x23/0x80
[   44.134593][    C0]  ? lock_release+0xc7/0x2b0
[   44.134615][    C0]  ? __perf_event_account_interrupt+0x116/0x160
[   44.134631][    C0]  __perf_event_account_interrupt+0x116/0x160
[   44.134644][    C0]  __perf_event_overflow+0x3e/0x120
[   44.134660][    C0]  handle_pmi_common+0x30f/0x400
[   44.134666][    C0]  ? perf_ftrace_function_call+0x268/0x2e0
[   44.134676][    C0]  ? perf_ftrace_function_call+0x53/0x2e0
[   44.134719][    C0]  ? 0xffffffffa00b0083
[   44.134745][    C0]  ? 0xffffffffa00b0083
[   44.134789][    C0]  ? intel_pmu_handle_irq+0x120/0x620
[   44.134798][    C0]  ? handle_pmi_common+0x5/0x400
[   44.134804][    C0]  intel_pmu_handle_irq+0x120/0x620
[   44.134828][    C0]  perf_event_nmi_handler+0x30/0x50
[   44.134840][    C0]  nmi_handle+0xba/0x2a0
[   44.134866][    C0]  default_do_nmi+0x45/0xf0
[   44.134878][    C0]  exc_nmi+0x155/0x170
[   44.134895][    C0]  end_repeat_nmi+0x16/0x55
[   44.134903][    C0] RIP: 0010:__sanitizer_cov_trace_pc+0x7/0x60
[   44.134912][    C0] Code: c0 81 e2 00 01 ff 00 75 10 65 48 8b 04 25 c0 71 01 00 48 8b 80 90 15 00 00 f3 c3 0f 1f 84 00 00 00 00 00 65 8b 05 89 76 e0 7e <89> c1 48 8b 34 24 65 48 8b 14 25 c0 71 01 00 81 e1 00 01 00 00 a9
[   44.134917][    C0] RSP: 0000:ffffc90000003dd0 EFLAGS: 00000046
[   44.134923][    C0] RAX: 0000000080010003 RBX: ffffffff82a1db40 RCX: 0000000000000000
[   44.134927][    C0] RDX: ffff888106dd8000 RSI: ffffffff810122fa RDI: 0000000000000000
[   44.134931][    C0] RBP: ffff88813bc41f58 R08: ffff888106dd8a68 R09: 00000000fffffffe
[   44.134934][    C0] R10: ffffc90000003be0 R11: 00000000ffd03bc8 R12: ffff88813bc118a0
[   44.134938][    C0] R13: ffff88813bc41e50 R14: 0000000000000000 R15: ffffffff82a1db40
[   44.134966][    C0]  ? __intel_pmu_enable_all.constprop.47+0x6a/0x100
[   44.134987][    C0]  ? __sanitizer_cov_trace_pc+0x7/0x60
[   44.135005][    C0]  ? kcov_common_handle+0x30/0x30
[   44.135019][    C0]  </NMI>
[   44.135021][    C0] WARNING: stack recursion on stack type 6
[   44.135024][    C0] Modules linked in:
[   44.252321][    C0] ---[ end trace 74f641c0b984aec5 ]---
[   44.252325][    C0] RIP: 0010:perf_swevent_get_recursion_context+0x0/0x70
[   44.252335][    C0] Code: 48 03 43 28 48 8b 0c 24 bb 01 00 00 00 4c 29 f0 48 39 c8 48 0f 47 c1 49 89 45 08 e9 48 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 <55> 53 e8 09 20 f2 ff 48 c7 c2 20 4d 03 00 65 48 03 15 5a 3b d2 7e
[   44.252341][    C0] RSP: 0018:fffffe000000b000 EFLAGS: 00010046
[   44.252347][    C0] RAX: 0000000080120007 RBX: fffffe000000b050 RCX: 0000000000000000
[   44.252351][    C0] RDX: ffff888106dd8000 RSI: ffffffff81269031 RDI: 000000000000001c
[   44.252355][    C0] RBP: 000000000000001c R08: 0000000000000001 R09: 0000000000000000
[   44.252358][    C0] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[   44.252362][    C0] R13: fffffe000000b044 R14: 0000000000000001 R15: 0000000000000001
[   44.252366][    C0] FS:  00007f5f39086740(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
[   44.252373][    C0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   44.252377][    C0] CR2: fffffe000000aff8 CR3: 0000000105894005 CR4: 00000000003606f0
[   44.252381][    C0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   44.252384][    C0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   44.252389][    C0] Kernel panic - not syncing: Fatal exception in interrupt
[   44.252783][    C0] Kernel Offset: disabled






> 
> [   58.999453][    C0] traps: PANIC: double fault, error_code: 0x0
> [   58.999472][    C0] double fault: 0000 [#1] SMP PTI
> [   58.999478][    C0] CPU: 0 PID: 799 Comm: a.out Not tainted 5.14.0+ #107
> [   58.999485][    C0] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> [   58.999488][    C0] RIP: 0010:perf_swevent_get_recursion_context+0x0/0x70
> [   58.999505][    C0] Code: 48 03 43 28 48 8b 0c 24 bb 01 00 00 00 4c 29 f0 48 39 c8 48 0f 47 c1 49 89 45 08 e9 48 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 <55> 53 e8 89 18 f2 ff 48 c7 c2 20 4d 03 00 65 48 03 15 5a 34 d2 7e
> [   58.999511][    C0] RSP: 0018:fffffe000000b000 EFLAGS: 00010046
> [   58.999517][    C0] RAX: 0000000080120005 RBX: fffffe000000b050 RCX: 0000000000000000
> [   58.999522][    C0] RDX: ffff888106f5a180 RSI: ffffffff812696d1 RDI: 000000000000001c
> [   58.999526][    C0] RBP: 000000000000001c R08: 0000000000000001 R09: 0000000000000000
> [   58.999530][    C0] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [   58.999533][    C0] R13: fffffe000000b044 R14: 0000000000000001 R15: 0000000000000001
> [   58.999537][    C0] FS:  00007f21fc62c740(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
> [   58.999543][    C0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   58.999547][    C0] CR2: fffffe000000aff8 CR3: 0000000106e2e001 CR4: 00000000003606f0
> [   58.999551][    C0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   58.999555][    C0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   58.999559][    C0] Call Trace:
> [   58.999562][    C0]  <NMI>
> [   58.999565][    C0]  perf_trace_buf_alloc+0x26/0xd0
> [   58.999579][    C0]  ? is_prefetch.isra.25+0x260/0x260
> [   58.999586][    C0]  ? __bad_area_nosemaphore+0x1b8/0x280
> [   58.999592][    C0]  perf_ftrace_function_call+0x18f/0x2e0
> [   58.999604][    C0]  ? perf_trace_buf_alloc+0xbf/0xd0
> [   58.999642][    C0]  ? 0xffffffffa00ba083
> [   58.999669][    C0]  0xffffffffa00ba083
> [   58.999688][    C0]  ? 0xffffffffa00ba083
> [   58.999708][    C0]  ? kernelmode_fixup_or_oops+0x5/0x120
> [   58.999721][    C0]  kernelmode_fixup_or_oops+0x5/0x120
> [   58.999728][    C0]  __bad_area_nosemaphore+0x1b8/0x280
> [   58.999747][    C0]  do_user_addr_fault+0x410/0x920
> [   58.999763][    C0]  ? 0xffffffffa00ba083
> [   58.999780][    C0]  exc_page_fault+0x92/0x300
> [   58.999796][    C0]  asm_exc_page_fault+0x1e/0x30
> [   58.999805][    C0] RIP: 0010:__get_user_nocheck_8+0x6/0x13
> [   58.999814][    C0] Code: 01 ca c3 90 0f 01 cb 0f ae e8 0f b7 10 31 c0 0f 01 ca c3 90 0f 01 cb 0f ae e8 8b 10 31 c0 0f 01 ca c3 66 90 0f 01 cb 0f ae e8 <48> 8b 10 31 c0 0f 01 ca c3 90 0f 01 ca 31 d2 48 c7 c0 f2 ff ff ff
> [   58.999819][    C0] RSP: 0018:fffffe000000b370 EFLAGS: 00050046
> [   58.999825][    C0] RAX: 0000000000000000 RBX: fffffe000000b3d0 RCX: 0000000000000000
> [   58.999828][    C0] RDX: ffff888106f5a180 RSI: ffffffff8100a91e RDI: fffffe000000b3d0
> [   58.999832][    C0] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
> [   58.999836][    C0] R10: 0000000000000000 R11: 0000000000000014 R12: 00007fffffffeff0
> [   58.999839][    C0] R13: ffff888106f5a180 R14: 000000000000007f R15: 000000000000007f
> [   58.999867][    C0]  ? perf_callchain_user+0x25e/0x2f0
> [   58.999886][    C0]  perf_callchain_user+0x266/0x2f0
> [   58.999907][    C0]  get_perf_callchain+0x194/0x210
> [   58.999938][    C0]  perf_callchain+0xa3/0xc0
> [   58.999956][    C0]  perf_prepare_sample+0xa5/0xa60
> [   58.999984][    C0]  perf_event_output_forward+0x7b/0x1b0
> [   58.999996][    C0]  ? perf_swevent_get_recursion_context+0x62/0x70
> [   59.000008][    C0]  ? perf_trace_buf_alloc+0xbf/0xd0
> [   59.000026][    C0]  __perf_event_overflow+0x67/0x120
> [   59.000042][    C0]  perf_swevent_overflow+0xcb/0x110
> [   59.000065][    C0]  perf_swevent_event+0xb0/0xf0
> [   59.000078][    C0]  perf_tp_event+0x292/0x410
> [   59.000085][    C0]  ? 0xffffffffa00ba083
> [   59.000120][    C0]  ? tracing_gen_ctx_irq_test+0x8f/0xa0
> [   59.000129][    C0]  ? perf_swevent_event+0x28/0xf0
> [   59.000142][    C0]  ? perf_tp_event+0x2d7/0x410
> [   59.000150][    C0]  ? tracing_gen_ctx_irq_test+0x8f/0xa0
> [   59.000157][    C0]  ? perf_swevent_event+0x28/0xf0
> [   59.000171][    C0]  ? perf_tp_event+0x2d7/0x410
> [   59.000179][    C0]  ? tracing_gen_ctx_irq_test+0x8f/0xa0
> [   59.000198][    C0]  ? tracing_gen_ctx_irq_test+0x8f/0xa0
> [   59.000206][    C0]  ? perf_swevent_event+0x28/0xf0
> [   59.000233][    C0]  ? perf_trace_run_bpf_submit+0x87/0xc0
> [   59.000244][    C0]  ? perf_trace_buf_alloc+0x86/0xd0
> [   59.000250][    C0]  perf_trace_run_bpf_submit+0x87/0xc0
> [   59.000276][    C0]  perf_trace_lock_acquire+0x12b/0x170
> [   59.000308][    C0]  lock_acquire+0x1bf/0x2e0
> [   59.000317][    C0]  ? perf_output_begin+0x5/0x4b0
> [   59.000348][    C0]  perf_output_begin+0x70/0x4b0
> [   59.000356][    C0]  ? perf_output_begin+0x5/0x4b0
> [   59.000394][    C0]  perf_log_throttle+0xe2/0x1a0
> [   59.000431][    C0]  ? 0xffffffffa00ba083
> [   59.000447][    C0]  ? perf_event_update_userpage+0x135/0x2d0
> [   59.000462][    C0]  ? 0xffffffffa00ba083
> [   59.000471][    C0]  ? 0xffffffffa00ba083
> [   59.000495][    C0]  ? perf_event_update_userpage+0x135/0x2d0
> [   59.000506][    C0]  ? rcu_read_lock_held_common+0x5/0x40
> [   59.000519][    C0]  ? rcu_read_lock_held_common+0xe/0x40
> [   59.000528][    C0]  ? rcu_read_lock_sched_held+0x23/0x80
> [   59.000539][    C0]  ? lock_release+0xc7/0x2b0
> [   59.000560][    C0]  ? __perf_event_account_interrupt+0x116/0x160
> [   59.000576][    C0]  __perf_event_account_interrupt+0x116/0x160
> [   59.000589][    C0]  __perf_event_overflow+0x3e/0x120
> [   59.000604][    C0]  handle_pmi_common+0x30f/0x400
> [   59.000611][    C0]  ? perf_ftrace_function_call+0x268/0x2e0
> [   59.000620][    C0]  ? perf_ftrace_function_call+0x53/0x2e0
> [   59.000663][    C0]  ? 0xffffffffa00ba083
> [   59.000689][    C0]  ? 0xffffffffa00ba083
> [   59.000729][    C0]  ? intel_pmu_handle_irq+0x120/0x620
> [   59.000737][    C0]  ? handle_pmi_common+0x5/0x400
> [   59.000743][    C0]  intel_pmu_handle_irq+0x120/0x620
> [   59.000767][    C0]  perf_event_nmi_handler+0x30/0x50
> [   59.000779][    C0]  nmi_handle+0xba/0x2a0
> [   59.000806][    C0]  default_do_nmi+0x45/0xf0
> [   59.000819][    C0]  exc_nmi+0x155/0x170
> [   59.000838][    C0]  end_repeat_nmi+0x16/0x55
> [   59.000845][    C0] RIP: 0010:__sanitizer_cov_trace_pc+0xd/0x60
> [   59.000853][    C0] Code: 00 75 10 65 48 8b 04 25 c0 71 01 00 48 8b 80 88 15 00 00 f3 c3 0f 1f 84 00 00 00 00 00 65 8b 05 09 77 e0 7e 89 c1 48 8b 34 24 <65> 48 8b 14 25 c0 71 01 00 81 e1 00 01 00 00 a9 00 01 ff 00 74 10
> [   59.000858][    C0] RSP: 0000:ffffc90000003dd0 EFLAGS: 00000046
> [   59.000863][    C0] RAX: 0000000080010001 RBX: ffffffff82a1db40 RCX: 0000000080010001
> [   59.000867][    C0] RDX: ffff888106f5a180 RSI: ffffffff81009613 RDI: 0000000000000000
> [   59.000871][    C0] RBP: ffff88813bc40d08 R08: ffff888106f5abb8 R09: 00000000fffffffe
> [   59.000875][    C0] R10: ffffc90000003be0 R11: 00000000ffd17b4b R12: ffff88813bc118a0
> [   59.000878][    C0] R13: ffff88813bc40c00 R14: 0000000000000000 R15: ffffffff82a1db40
> [   59.000906][    C0]  ? x86_pmu_enable+0x383/0x440
> [   59.000924][    C0]  ? __sanitizer_cov_trace_pc+0xd/0x60
> [   59.000942][    C0]  ? intel_pmu_handle_irq+0x284/0x620
> [   59.000954][    C0]  </NMI>
> [   59.000957][    C0] WARNING: stack recursion on stack type 6
> [   59.000960][    C0] Modules linked in:
> [   59.120070][    C0] ---[ end trace 07eb1e3908914794 ]---
> [   59.120075][    C0] RIP: 0010:perf_swevent_get_recursion_context+0x0/0x70
> [   59.120087][    C0] Code: 48 03 43 28 48 8b 0c 24 bb 01 00 00 00 4c 29 f0 48 39 c8 48 0f 47 c1 49 89 45 08 e9 48 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 <55> 53 e8 89 18 f2 ff 48 c7 c2 20 4d 03 00 65 48 03 15 5a 34 d2 7e
> [   59.120092][    C0] RSP: 0018:fffffe000000b000 EFLAGS: 00010046
> [   59.120098][    C0] RAX: 0000000080120005 RBX: fffffe000000b050 RCX: 0000000000000000
> [   59.120102][    C0] RDX: ffff888106f5a180 RSI: ffffffff812696d1 RDI: 000000000000001c
> [   59.120106][    C0] RBP: 000000000000001c R08: 0000000000000001 R09: 0000000000000000
> [   59.120110][    C0] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [   59.120114][    C0] R13: fffffe000000b044 R14: 0000000000000001 R15: 0000000000000001
> [   59.120118][    C0] FS:  00007f21fc62c740(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
> [   59.120125][    C0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   59.120129][    C0] CR2: fffffe000000aff8 CR3: 0000000106e2e001 CR4: 00000000003606f0
> [   59.120133][    C0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   59.120137][    C0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   59.120141][    C0] Kernel panic - not syncing: Fatal exception in interrupt
> [   59.120540][    C0] Kernel Offset: disabled
> 
> The reproducer is below:
> 
> 
> // autogenerated by syzkaller (https://github.com/google/syzkaller)
> 
> #define _GNU_SOURCE
> 
> #include <dirent.h>
> #include <endian.h>
> #include <errno.h>
> #include <fcntl.h>
> #include <signal.h>
> #include <stdarg.h>
> #include <stdbool.h>
> #include <stdint.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <sys/prctl.h>
> #include <sys/stat.h>
> #include <sys/syscall.h>
> #include <sys/types.h>
> #include <sys/wait.h>
> #include <time.h>
> #include <unistd.h>
> 
> static void sleep_ms(uint64_t ms)
> {
> 	usleep(ms * 1000);
> }
> 
> static uint64_t current_time_ms(void)
> {
> 	struct timespec ts;
> 	if (clock_gettime(CLOCK_MONOTONIC, &ts))
> 		exit(1);
> 	return (uint64_t)ts.tv_sec * 1000 + (uint64_t)ts.tv_nsec / 1000000;
> }
> 
> #define BITMASK(bf_off,bf_len) (((1ull << (bf_len)) - 1) << (bf_off))
> #define STORE_BY_BITMASK(type,htobe,addr,val,bf_off,bf_len) *(type*)(addr) = htobe((htobe(*(type*)(addr)) & ~BITMASK((bf_off), (bf_len))) | (((type)(val) << (bf_off)) & BITMASK((bf_off), (bf_len))))
> 
> static bool write_file(const char* file, const char* what, ...)
> {
> 	char buf[1024];
> 	va_list args;
> 	va_start(args, what);
> 	vsnprintf(buf, sizeof(buf), what, args);
> 	va_end(args);
> 	buf[sizeof(buf) - 1] = 0;
> 	int len = strlen(buf);
> 	int fd = open(file, O_WRONLY | O_CLOEXEC);
> 	if (fd == -1)
> 		return false;
> 	if (write(fd, buf, len) != len) {
> 		int err = errno;
> 		close(fd);
> 		errno = err;
> 		return false;
> 	}
> 	close(fd);
> 	return true;
> }
> 
> static void kill_and_wait(int pid, int* status)
> {
> 	kill(-pid, SIGKILL);
> 	kill(pid, SIGKILL);
> 	for (int i = 0; i < 100; i++) {
> 		if (waitpid(-1, status, WNOHANG | __WALL) == pid)
> 			return;
> 		usleep(1000);
> 	}
> 	DIR* dir = opendir("/sys/fs/fuse/connections");
> 	if (dir) {
> 		for (;;) {
> 			struct dirent* ent = readdir(dir);
> 			if (!ent)
> 				break;
> 			if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
> 				continue;
> 			char abort[300];
> 			snprintf(abort, sizeof(abort), "/sys/fs/fuse/connections/%s/abort", ent->d_name);
> 			int fd = open(abort, O_WRONLY);
> 			if (fd == -1) {
> 				continue;
> 			}
> 			if (write(fd, abort, 1) < 0) {
> 			}
> 			close(fd);
> 		}
> 		closedir(dir);
> 	} else {
> 	}
> 	while (waitpid(-1, status, __WALL) != pid) {
> 	}
> }
> 
> static void setup_test()
> {
> 	prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
> 	setpgrp();
> 	write_file("/proc/self/oom_score_adj", "1000");
> }
> 
> static void execute_one(void);
> 
> #define WAIT_FLAGS __WALL
> 
> static void loop(void)
> {
> 	int iter = 0;
> 	for (;; iter++) {
> 		int pid = fork();
> 		if (pid < 0)
> 			exit(1);
> 		if (pid == 0) {
> 			setup_test();
> 			execute_one();
> 			exit(0);
> 		}
> 		int status = 0;
> 		uint64_t start = current_time_ms();
> 		for (;;) {
> 			if (waitpid(-1, &status, WNOHANG | WAIT_FLAGS) == pid)
> 				break;
> 			sleep_ms(1);
> 			if (current_time_ms() - start < 5000) {
> 				continue;
> 			}
> 			kill_and_wait(pid, &status);
> 			break;
> 		}
> 	}
> }
> 
> void execute_one(void)
> {
> *(uint32_t*)0x20000380 = 2;
> *(uint32_t*)0x20000384 = 0x70;
> *(uint8_t*)0x20000388 = 1;
> *(uint8_t*)0x20000389 = 0;
> *(uint8_t*)0x2000038a = 0;
> *(uint8_t*)0x2000038b = 0;
> *(uint32_t*)0x2000038c = 0;
> *(uint64_t*)0x20000390 = 0;
> *(uint64_t*)0x20000398 = 0;
> *(uint64_t*)0x200003a0 = 0;
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 0, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 1, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 2, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 3, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 4, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 5, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 6, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 7, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 8, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 9, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 10, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 11, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 12, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 13, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 14, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 15, 2);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 17, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 18, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 19, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 20, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 21, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 22, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 23, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 24, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 25, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 26, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 27, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 28, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 29, 35);
> *(uint32_t*)0x200003b0 = 0;
> *(uint32_t*)0x200003b4 = 0;
> *(uint64_t*)0x200003b8 = 0;
> *(uint64_t*)0x200003c0 = 0;
> *(uint64_t*)0x200003c8 = 0;
> *(uint64_t*)0x200003d0 = 0;
> *(uint32_t*)0x200003d8 = 0;
> *(uint32_t*)0x200003dc = 0;
> *(uint64_t*)0x200003e0 = 0;
> *(uint32_t*)0x200003e8 = 0;
> *(uint16_t*)0x200003ec = 0;
> *(uint16_t*)0x200003ee = 0;
> 	syscall(__NR_perf_event_open, 0x20000380ul, -1, 0ul, -1, 0ul);
> *(uint32_t*)0x20000080 = 0;
> *(uint32_t*)0x20000084 = 0x70;
> *(uint8_t*)0x20000088 = 0;
> *(uint8_t*)0x20000089 = 0;
> *(uint8_t*)0x2000008a = 0;
> *(uint8_t*)0x2000008b = 0;
> *(uint32_t*)0x2000008c = 0;
> *(uint64_t*)0x20000090 = 0x9c;
> *(uint64_t*)0x20000098 = 0;
> *(uint64_t*)0x200000a0 = 0;
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 0, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 1, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 2, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 3, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 4, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 5, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 6, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 7, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 8, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 9, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 10, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 11, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 12, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 13, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 14, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 15, 2);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 17, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 18, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 19, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 20, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 21, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 22, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 23, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 24, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 25, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 26, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 27, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 28, 1);
> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 29, 35);
> *(uint32_t*)0x200000b0 = 0;
> *(uint32_t*)0x200000b4 = 0;
> *(uint64_t*)0x200000b8 = 0;
> *(uint64_t*)0x200000c0 = 0;
> *(uint64_t*)0x200000c8 = 0;
> *(uint64_t*)0x200000d0 = 0;
> *(uint32_t*)0x200000d8 = 0;
> *(uint32_t*)0x200000dc = 0;
> *(uint64_t*)0x200000e0 = 0;
> *(uint32_t*)0x200000e8 = 0;
> *(uint16_t*)0x200000ec = 0;
> *(uint16_t*)0x200000ee = 0;
> 	syscall(__NR_perf_event_open, 0x20000080ul, -1, 0ul, -1, 0ul);
> *(uint32_t*)0x20000140 = 2;
> *(uint32_t*)0x20000144 = 0x70;
> *(uint8_t*)0x20000148 = 0x47;
> *(uint8_t*)0x20000149 = 1;
> *(uint8_t*)0x2000014a = 0;
> *(uint8_t*)0x2000014b = 0;
> *(uint32_t*)0x2000014c = 0;
> *(uint64_t*)0x20000150 = 9;
> *(uint64_t*)0x20000158 = 0x61220;
> *(uint64_t*)0x20000160 = 0;
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 0, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 1, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 2, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 3, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 4, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 5, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 6, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 7, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 8, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 9, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 10, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 11, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 12, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 13, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 14, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 15, 2);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 17, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 18, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 19, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 20, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 21, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 22, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 23, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 24, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 25, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 26, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 27, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 28, 1);
> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 29, 35);
> *(uint32_t*)0x20000170 = 0;
> *(uint32_t*)0x20000174 = 0;
> *(uint64_t*)0x20000178 = 0;
> *(uint64_t*)0x20000180 = 0;
> *(uint64_t*)0x20000188 = 0;
> *(uint64_t*)0x20000190 = 1;
> *(uint32_t*)0x20000198 = 0;
> *(uint32_t*)0x2000019c = 0;
> *(uint64_t*)0x200001a0 = 2;
> *(uint32_t*)0x200001a8 = 0;
> *(uint16_t*)0x200001ac = 0;
> *(uint16_t*)0x200001ae = 0;
> 	syscall(__NR_perf_event_open, 0x20000140ul, 0, -1ul, -1, 0ul);
> 
> }
> int main(void)
> {
> 	syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
> 	syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
> 	syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
> 	loop();
> 	return 0;
> }
> 
> Regards,
> Michael Wang
> 
> 
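For readers decoding the raw stores in execute_one() above: the third
perf_event_open() call writes type = 2 (PERF_TYPE_TRACEPOINT), the config
bytes 0x47 and 0x01, sample_period = 9 and sample_type = 0x61220. Below is
a minimal sketch, not part of the original report, that assembles the
little-endian config value and names the sample_type bits. The bit names
are assumed to follow enum perf_event_sample_format in
<linux/perf_event.h>, and all helper names are hypothetical. Which
tracepoint id 0x147 refers to depends on the kernel build, so it is left
undecoded here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Names indexed by bit position. Assumption: this order mirrors
 * enum perf_event_sample_format in <linux/perf_event.h>.
 */
static const char *const sample_bit_names[] = {
	"IP", "TID", "TIME", "ADDR", "READ", "CALLCHAIN", "ID", "CPU",
	"PERIOD", "STREAM_ID", "RAW", "BRANCH_STACK", "REGS_USER",
	"STACK_USER", "WEIGHT", "DATA_SRC", "IDENTIFIER", "TRANSACTION",
	"REGS_INTR", "PHYS_ADDR",
};

/* Assemble a little-endian value from per-byte stores (at most 8 bytes). */
uint64_t config_from_bytes(const uint8_t *bytes, size_t n)
{
	uint64_t value = 0;

	for (size_t i = 0; i < n && i < 8; i++)
		value |= (uint64_t)bytes[i] << (8 * i);
	return value;
}

/*
 * Write the names of the set bits into buf, separated by '|'.
 * Bits without a known name are ignored; output stops early if buf
 * is too small. Returns the length of the resulting string.
 */
size_t decode_sample_type(uint64_t sample_type, char *buf, size_t len)
{
	size_t used = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';
	for (size_t i = 0; i < sizeof(sample_bit_names) / sizeof(sample_bit_names[0]); i++) {
		int n;

		if (!(sample_type & (1ULL << i)))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", sample_bit_names[i]);
		if (n < 0 || (size_t)n >= len - used)
			break; /* buffer full, keep what fits */
		used += (size_t)n;
	}
	return used;
}

/* Hypothetical self-check using the values of the third perf_event_open() call. */
void check_reproducer_attr(void)
{
	const uint8_t config_bytes[] = { 0x47, 0x01, 0x00, 0x00 };
	char buf[128];

	assert(config_from_bytes(config_bytes, sizeof(config_bytes)) == 0x147);
	decode_sample_type(0x61220, buf, sizeof(buf));
	assert(strcmp(buf, "CALLCHAIN|STREAM_ID|REGS_USER|TRANSACTION|REGS_INTR") == 0);
}
```

With the reproducer's values the config is 0x147 and the set bits are
CALLCHAIN, STREAM_ID, REGS_USER, TRANSACTION and REGS_INTR. The CALLCHAIN
bit is consistent with perf_callchain_user() showing up in the trace.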
> On 2021/9/13 10:49 PM, Dave Hansen wrote:
>> On 9/12/21 8:30 PM, 王贇 wrote:
>>> According to the trace, the story is as follows: the NMI
>>> triggered perf IRQ throttling and called perf_log_throttle(),
>>> which triggered the swevent overflow; the overflow path then
>>> ran perf_callchain_user(), which triggered a user page fault, and
>>> the page fault path triggered perf ftrace, which finally led to a
>>> suspected stack overflow.
>>>
>>> This patch disables ftrace on fault.c, which helps to avoid the panic.
>> ...
>>> +# Disable ftrace to avoid stack overflow.
>>> +CFLAGS_REMOVE_fault.o = $(CC_FLAGS_FTRACE)
>>
>> Was this observed on a mainline kernel?
>>
>> How reproducible is this?
>>
>> I suspect we're going into do_user_addr_fault(), then falling in here:
>>
>>>         if (unlikely(faulthandler_disabled() || !mm)) {
>>>                 bad_area_nosemaphore(regs, error_code, address);
>>>                 return;
>>>         }
>>
>> Then something double faults in perf_swevent_get_recursion_context().
>> But, you snipped all of the register dump out so I can't quite see
>> what's going on and what might have caused *that* fault.  But, in my
>> kernel perf_swevent_get_recursion_context+0x0/0x70 is:
>>
>> 	   mov    $0x27d00,%rdx
>>
>> which is rather unlikely to fault.
>>
>> Either way, we don't want to keep ftrace out of fault.c.  This patch is
>> just a hack, and doesn't really try to fix the underlying problem.  This
>> situation *should* be handled today.  There's code there to handle it.
>>
>> Something else really funky is going on.
>>
王贇 Sept. 14, 2021, 7:23 a.m. UTC | #5
On 2021/9/14 11:02 AM, 王贇 wrote:
[snip]
> [   44.133509][    C0] traps: PANIC: double fault, error_code: 0x0
> [   44.133519][    C0] double fault: 0000 [#1] SMP PTI
> [   44.133526][    C0] CPU: 0 PID: 743 Comm: a.out Not tainted 5.14.0-next-20210913 #469
> [   44.133532][    C0] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> [   44.133536][    C0] RIP: 0010:perf_swevent_get_recursion_context+0x0/0x70
> [   44.133549][    C0] Code: 48 03 43 28 48 8b 0c 24 bb 01 00 00 00 4c 29 f0 48 39 c8 48 0f 47 c1 49 89 45 08 e9 48 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 <55> 53 e8 09 20 f2 ff 48 c7 c2 20 4d 03 00 65 48 03 15 5a 3b d2 7e
> [   44.133556][    C0] RSP: 0018:fffffe000000b000 EFLAGS: 00010046

One more piece of information: I printed '__this_cpu_ist_bottom_va(NMI)'
on cpu0, and it is exactly the RSP value fffffe000000b000. Does this
imply that the NMI stack has overflowed?
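
One way to cross-check this against the register dump is the arithmetic
below. It is a minimal sketch under two assumptions: that the byte marked
<55> in the Code: line is the one-byte x86-64 encoding of push %rbp (a
64-bit push stores at RSP - 8), and that the page size is 4 KiB. The
helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the register dump in this report. */
#define REPORTED_RSP 0xfffffe000000b000ULL /* RSP at the double fault */
#define REPORTED_CR2 0xfffffe000000aff8ULL /* faulting address */

#define ASSUMED_PAGE_SIZE 4096ULL /* assumption: 4 KiB pages */

/* Address written by a 64-bit push: RSP is decremented by 8 first. */
uint64_t push_target(uint64_t rsp)
{
	return rsp - 8;
}

/* True if addr lies in the page immediately below the stack bottom. */
bool hits_page_below(uint64_t addr, uint64_t stack_bottom)
{
	return addr < stack_bottom &&
	       stack_bottom - addr <= ASSUMED_PAGE_SIZE;
}

/* Self-check: the push at function entry would write exactly at CR2. */
void check_reported_values(void)
{
	assert(push_target(REPORTED_RSP) == REPORTED_CR2);
	assert(hits_page_below(REPORTED_CR2, REPORTED_RSP));
}
```

Here push_target(0xfffffe000000b000) is 0xfffffe000000aff8, equal to the
reported CR2, so the faulting write lands 8 bytes below the address
printed for the NMI stack bottom. The log itself does not show how the
page below is mapped.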

Regards,
Michael Wang


> [   44.133562][    C0] RAX: 0000000080120007 RBX: fffffe000000b050 RCX: 0000000000000000
> [   44.133566][    C0] RDX: ffff888106dd8000 RSI: ffffffff81269031 RDI: 000000000000001c
> [   44.133570][    C0] RBP: 000000000000001c R08: 0000000000000001 R09: 0000000000000000
> [   44.133574][    C0] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [   44.133578][    C0] R13: fffffe000000b044 R14: 0000000000000001 R15: 0000000000000001
> [   44.133582][    C0] FS:  00007f5f39086740(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
> [   44.133588][    C0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   44.133593][    C0] CR2: fffffe000000aff8 CR3: 0000000105894005 CR4: 00000000003606f0
> [   44.133597][    C0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   44.133600][    C0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   44.133604][    C0] Call Trace:
> [   44.133607][    C0]  <NMI>
> [   44.133610][    C0]  perf_trace_buf_alloc+0x26/0xd0
> [   44.133623][    C0]  ? is_prefetch.isra.25+0x260/0x260
> [   44.133631][    C0]  ? __bad_area_nosemaphore+0x1b8/0x280
> [   44.133637][    C0]  perf_ftrace_function_call+0x18f/0x2e0
> [   44.133649][    C0]  ? perf_trace_buf_alloc+0xbf/0xd0
> [   44.133687][    C0]  ? 0xffffffffa00b0083
> [   44.133714][    C0]  0xffffffffa00b0083
> [   44.133733][    C0]  ? 0xffffffffa00b0083
> [   44.133753][    C0]  ? kernelmode_fixup_or_oops+0x5/0x120
> [   44.133773][    C0]  kernelmode_fixup_or_oops+0x5/0x120
> [   44.133780][    C0]  __bad_area_nosemaphore+0x1b8/0x280
> [   44.133799][    C0]  do_user_addr_fault+0x410/0x920
> [   44.133815][    C0]  ? 0xffffffffa00b0083
> [   44.133832][    C0]  exc_page_fault+0x92/0x300
> [   44.133849][    C0]  asm_exc_page_fault+0x1e/0x30
> [   44.133857][    C0] RIP: 0010:__get_user_nocheck_8+0x6/0x13
> [   44.133866][    C0] Code: 01 ca c3 90 0f 01 cb 0f ae e8 0f b7 10 31 c0 0f 01 ca c3 90 0f 01 cb 0f ae e8 8b 10 31 c0 0f 01 ca c3 66 90 0f 01 cb 0f ae e8 <48> 8b 10 31 c0 0f 01 ca c3 90 0f 01 ca 31 d2 48 c7 c0 f2 ff ff ff
> [   44.133872][    C0] RSP: 0018:fffffe000000b370 EFLAGS: 00050046
> [   44.133877][    C0] RAX: 0000000000000000 RBX: fffffe000000b3d0 RCX: 0000000000000000
> [   44.133881][    C0] RDX: ffff888106dd8000 RSI: ffffffff8100a8ee RDI: fffffe000000b3d0
> [   44.133885][    C0] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
> [   44.133889][    C0] R10: 0000000000000000 R11: 0000000000000014 R12: 00007fffffffeff0
> [   44.133893][    C0] R13: ffff888106dd8000 R14: 000000000000007f R15: 000000000000007f
> [   44.133920][    C0]  ? perf_callchain_user+0x25e/0x2f0
> [   44.133940][    C0]  perf_callchain_user+0x266/0x2f0
> [   44.133961][    C0]  get_perf_callchain+0x194/0x210
> [   44.133992][    C0]  perf_callchain+0xa3/0xc0
> [   44.134010][    C0]  perf_prepare_sample+0xa5/0xa60
> [   44.134037][    C0]  perf_event_output_forward+0x7b/0x1b0
> [   44.134051][    C0]  ? perf_swevent_get_recursion_context+0x62/0x70
> [   44.134062][    C0]  ? perf_trace_buf_alloc+0xbf/0xd0
> [   44.134080][    C0]  __perf_event_overflow+0x67/0x120
> [   44.134096][    C0]  perf_swevent_overflow+0xcb/0x110
> [   44.134114][    C0]  perf_swevent_event+0xb0/0xf0
> [   44.134128][    C0]  perf_tp_event+0x292/0x410
> [   44.134135][    C0]  ? 0xffffffffa00b0083
> [   44.134170][    C0]  ? tracing_gen_ctx_irq_test+0x8f/0xc0
> [   44.134179][    C0]  ? perf_swevent_event+0x28/0xf0
> [   44.134192][    C0]  ? perf_tp_event+0x2d7/0x410
> [   44.134200][    C0]  ? tracing_gen_ctx_irq_test+0x8f/0xc0
> [   44.134208][    C0]  ? perf_swevent_event+0x28/0xf0
> [   44.134221][    C0]  ? perf_tp_event+0x2d7/0x410
> [   44.134230][    C0]  ? tracing_gen_ctx_irq_test+0x8f/0xc0
> [   44.134250][    C0]  ? tracing_gen_ctx_irq_test+0x8f/0xc0
> [   44.134257][    C0]  ? perf_swevent_event+0x28/0xf0
> [   44.134284][    C0]  ? perf_trace_run_bpf_submit+0x87/0xc0
> [   44.134295][    C0]  ? perf_trace_buf_alloc+0x86/0xd0
> [   44.134302][    C0]  perf_trace_run_bpf_submit+0x87/0xc0
> [   44.134327][    C0]  perf_trace_lock_acquire+0x12b/0x170
> [   44.134360][    C0]  lock_acquire+0x1bf/0x2e0
> [   44.134370][    C0]  ? perf_output_begin+0x5/0x4b0
> [   44.134401][    C0]  perf_output_begin+0x70/0x4b0
> [   44.134408][    C0]  ? perf_output_begin+0x5/0x4b0
> [   44.134446][    C0]  perf_log_throttle+0xe2/0x1a0
> [   44.134484][    C0]  ? 0xffffffffa00b0083
> [   44.134500][    C0]  ? perf_event_update_userpage+0x135/0x2d0
> [   44.134515][    C0]  ? 0xffffffffa00b0083
> [   44.134524][    C0]  ? 0xffffffffa00b0083
> [   44.134548][    C0]  ? perf_event_update_userpage+0x135/0x2d0
> [   44.134559][    C0]  ? rcu_read_lock_held_common+0x5/0x40
> [   44.134573][    C0]  ? rcu_read_lock_held_common+0xe/0x40
> [   44.134582][    C0]  ? rcu_read_lock_sched_held+0x23/0x80
> [   44.134593][    C0]  ? lock_release+0xc7/0x2b0
> [   44.134615][    C0]  ? __perf_event_account_interrupt+0x116/0x160
> [   44.134631][    C0]  __perf_event_account_interrupt+0x116/0x160
> [   44.134644][    C0]  __perf_event_overflow+0x3e/0x120
> [   44.134660][    C0]  handle_pmi_common+0x30f/0x400
> [   44.134666][    C0]  ? perf_ftrace_function_call+0x268/0x2e0
> [   44.134676][    C0]  ? perf_ftrace_function_call+0x53/0x2e0
> [   44.134719][    C0]  ? 0xffffffffa00b0083
> [   44.134745][    C0]  ? 0xffffffffa00b0083
> [   44.134789][    C0]  ? intel_pmu_handle_irq+0x120/0x620
> [   44.134798][    C0]  ? handle_pmi_common+0x5/0x400
> [   44.134804][    C0]  intel_pmu_handle_irq+0x120/0x620
> [   44.134828][    C0]  perf_event_nmi_handler+0x30/0x50
> [   44.134840][    C0]  nmi_handle+0xba/0x2a0
> [   44.134866][    C0]  default_do_nmi+0x45/0xf0
> [   44.134878][    C0]  exc_nmi+0x155/0x170
> [   44.134895][    C0]  end_repeat_nmi+0x16/0x55
> [   44.134903][    C0] RIP: 0010:__sanitizer_cov_trace_pc+0x7/0x60
> [   44.134912][    C0] Code: c0 81 e2 00 01 ff 00 75 10 65 48 8b 04 25 c0 71 01 00 48 8b 80 90 15 00 00 f3 c3 0f 1f 84 00 00 00 00 00 65 8b 05 89 76 e0 7e <89> c1 48 8b 34 24 65 48 8b 14 25 c0 71 01 00 81 e1 00 01 00 00 a9
> [   44.134917][    C0] RSP: 0000:ffffc90000003dd0 EFLAGS: 00000046
> [   44.134923][    C0] RAX: 0000000080010003 RBX: ffffffff82a1db40 RCX: 0000000000000000
> [   44.134927][    C0] RDX: ffff888106dd8000 RSI: ffffffff810122fa RDI: 0000000000000000
> [   44.134931][    C0] RBP: ffff88813bc41f58 R08: ffff888106dd8a68 R09: 00000000fffffffe
> [   44.134934][    C0] R10: ffffc90000003be0 R11: 00000000ffd03bc8 R12: ffff88813bc118a0
> [   44.134938][    C0] R13: ffff88813bc41e50 R14: 0000000000000000 R15: ffffffff82a1db40
> [   44.134966][    C0]  ? __intel_pmu_enable_all.constprop.47+0x6a/0x100
> [   44.134987][    C0]  ? __sanitizer_cov_trace_pc+0x7/0x60
> [   44.135005][    C0]  ? kcov_common_handle+0x30/0x30
> [   44.135019][    C0]  </NMI>
> [   44.135021][    C0] WARNING: stack recursion on stack type 6
> [   44.135024][    C0] Modules linked in:
> [   44.252321][    C0] ---[ end trace 74f641c0b984aec5 ]---
> [   44.252325][    C0] RIP: 0010:perf_swevent_get_recursion_context+0x0/0x70
> [   44.252335][    C0] Code: 48 03 43 28 48 8b 0c 24 bb 01 00 00 00 4c 29 f0 48 39 c8 48 0f 47 c1 49 89 45 08 e9 48 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 <55> 53 e8 09 20 f2 ff 48 c7 c2 20 4d 03 00 65 48 03 15 5a 3b d2 7e
> [   44.252341][    C0] RSP: 0018:fffffe000000b000 EFLAGS: 00010046
> [   44.252347][    C0] RAX: 0000000080120007 RBX: fffffe000000b050 RCX: 0000000000000000
> [   44.252351][    C0] RDX: ffff888106dd8000 RSI: ffffffff81269031 RDI: 000000000000001c
> [   44.252355][    C0] RBP: 000000000000001c R08: 0000000000000001 R09: 0000000000000000
> [   44.252358][    C0] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [   44.252362][    C0] R13: fffffe000000b044 R14: 0000000000000001 R15: 0000000000000001
> [   44.252366][    C0] FS:  00007f5f39086740(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
> [   44.252373][    C0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   44.252377][    C0] CR2: fffffe000000aff8 CR3: 0000000105894005 CR4: 00000000003606f0
> [   44.252381][    C0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   44.252384][    C0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   44.252389][    C0] Kernel panic - not syncing: Fatal exception in interrupt
> [   44.252783][    C0] Kernel Offset: disabled
> 
> 
> 
> 
> 
> 
>>
>> [   58.999453][    C0] traps: PANIC: double fault, error_code: 0x0
>> [   58.999472][    C0] double fault: 0000 [#1] SMP PTI
>> [   58.999478][    C0] CPU: 0 PID: 799 Comm: a.out Not tainted 5.14.0+ #107
>> [   58.999485][    C0] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
>> [   58.999488][    C0] RIP: 0010:perf_swevent_get_recursion_context+0x0/0x70
>> [   58.999505][    C0] Code: 48 03 43 28 48 8b 0c 24 bb 01 00 00 00 4c 29 f0 48 39 c8 48 0f 47 c1 49 89 45 08 e9 48 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 <55> 53 e8 89 18 f2 ff 48 c7 c2 20 4d 03 00 65 48 03 15 5a 34 d2 7e
>> [   58.999511][    C0] RSP: 0018:fffffe000000b000 EFLAGS: 00010046
>> [   58.999517][    C0] RAX: 0000000080120005 RBX: fffffe000000b050 RCX: 0000000000000000
>> [   58.999522][    C0] RDX: ffff888106f5a180 RSI: ffffffff812696d1 RDI: 000000000000001c
>> [   58.999526][    C0] RBP: 000000000000001c R08: 0000000000000001 R09: 0000000000000000
>> [   58.999530][    C0] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>> [   58.999533][    C0] R13: fffffe000000b044 R14: 0000000000000001 R15: 0000000000000001
>> [   58.999537][    C0] FS:  00007f21fc62c740(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
>> [   58.999543][    C0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [   58.999547][    C0] CR2: fffffe000000aff8 CR3: 0000000106e2e001 CR4: 00000000003606f0
>> [   58.999551][    C0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [   58.999555][    C0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [   58.999559][    C0] Call Trace:
>> [   58.999562][    C0]  <NMI>
>> [   58.999565][    C0]  perf_trace_buf_alloc+0x26/0xd0
>> [   58.999579][    C0]  ? is_prefetch.isra.25+0x260/0x260
>> [   58.999586][    C0]  ? __bad_area_nosemaphore+0x1b8/0x280
>> [   58.999592][    C0]  perf_ftrace_function_call+0x18f/0x2e0
>> [   58.999604][    C0]  ? perf_trace_buf_alloc+0xbf/0xd0
>> [   58.999642][    C0]  ? 0xffffffffa00ba083
>> [   58.999669][    C0]  0xffffffffa00ba083
>> [   58.999688][    C0]  ? 0xffffffffa00ba083
>> [   58.999708][    C0]  ? kernelmode_fixup_or_oops+0x5/0x120
>> [   58.999721][    C0]  kernelmode_fixup_or_oops+0x5/0x120
>> [   58.999728][    C0]  __bad_area_nosemaphore+0x1b8/0x280
>> [   58.999747][    C0]  do_user_addr_fault+0x410/0x920
>> [   58.999763][    C0]  ? 0xffffffffa00ba083
>> [   58.999780][    C0]  exc_page_fault+0x92/0x300
>> [   58.999796][    C0]  asm_exc_page_fault+0x1e/0x30
>> [   58.999805][    C0] RIP: 0010:__get_user_nocheck_8+0x6/0x13
>> [   58.999814][    C0] Code: 01 ca c3 90 0f 01 cb 0f ae e8 0f b7 10 31 c0 0f 01 ca c3 90 0f 01 cb 0f ae e8 8b 10 31 c0 0f 01 ca c3 66 90 0f 01 cb 0f ae e8 <48> 8b 10 31 c0 0f 01 ca c3 90 0f 01 ca 31 d2 48 c7 c0 f2 ff ff ff
>> [   58.999819][    C0] RSP: 0018:fffffe000000b370 EFLAGS: 00050046
>> [   58.999825][    C0] RAX: 0000000000000000 RBX: fffffe000000b3d0 RCX: 0000000000000000
>> [   58.999828][    C0] RDX: ffff888106f5a180 RSI: ffffffff8100a91e RDI: fffffe000000b3d0
>> [   58.999832][    C0] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
>> [   58.999836][    C0] R10: 0000000000000000 R11: 0000000000000014 R12: 00007fffffffeff0
>> [   58.999839][    C0] R13: ffff888106f5a180 R14: 000000000000007f R15: 000000000000007f
>> [   58.999867][    C0]  ? perf_callchain_user+0x25e/0x2f0
>> [   58.999886][    C0]  perf_callchain_user+0x266/0x2f0
>> [   58.999907][    C0]  get_perf_callchain+0x194/0x210
>> [   58.999938][    C0]  perf_callchain+0xa3/0xc0
>> [   58.999956][    C0]  perf_prepare_sample+0xa5/0xa60
>> [   58.999984][    C0]  perf_event_output_forward+0x7b/0x1b0
>> [   58.999996][    C0]  ? perf_swevent_get_recursion_context+0x62/0x70
>> [   59.000008][    C0]  ? perf_trace_buf_alloc+0xbf/0xd0
>> [   59.000026][    C0]  __perf_event_overflow+0x67/0x120
>> [   59.000042][    C0]  perf_swevent_overflow+0xcb/0x110
>> [   59.000065][    C0]  perf_swevent_event+0xb0/0xf0
>> [   59.000078][    C0]  perf_tp_event+0x292/0x410
>> [   59.000085][    C0]  ? 0xffffffffa00ba083
>> [   59.000120][    C0]  ? tracing_gen_ctx_irq_test+0x8f/0xa0
>> [   59.000129][    C0]  ? perf_swevent_event+0x28/0xf0
>> [   59.000142][    C0]  ? perf_tp_event+0x2d7/0x410
>> [   59.000150][    C0]  ? tracing_gen_ctx_irq_test+0x8f/0xa0
>> [   59.000157][    C0]  ? perf_swevent_event+0x28/0xf0
>> [   59.000171][    C0]  ? perf_tp_event+0x2d7/0x410
>> [   59.000179][    C0]  ? tracing_gen_ctx_irq_test+0x8f/0xa0
>> [   59.000198][    C0]  ? tracing_gen_ctx_irq_test+0x8f/0xa0
>> [   59.000206][    C0]  ? perf_swevent_event+0x28/0xf0
>> [   59.000233][    C0]  ? perf_trace_run_bpf_submit+0x87/0xc0
>> [   59.000244][    C0]  ? perf_trace_buf_alloc+0x86/0xd0
>> [   59.000250][    C0]  perf_trace_run_bpf_submit+0x87/0xc0
>> [   59.000276][    C0]  perf_trace_lock_acquire+0x12b/0x170
>> [   59.000308][    C0]  lock_acquire+0x1bf/0x2e0
>> [   59.000317][    C0]  ? perf_output_begin+0x5/0x4b0
>> [   59.000348][    C0]  perf_output_begin+0x70/0x4b0
>> [   59.000356][    C0]  ? perf_output_begin+0x5/0x4b0
>> [   59.000394][    C0]  perf_log_throttle+0xe2/0x1a0
>> [   59.000431][    C0]  ? 0xffffffffa00ba083
>> [   59.000447][    C0]  ? perf_event_update_userpage+0x135/0x2d0
>> [   59.000462][    C0]  ? 0xffffffffa00ba083
>> [   59.000471][    C0]  ? 0xffffffffa00ba083
>> [   59.000495][    C0]  ? perf_event_update_userpage+0x135/0x2d0
>> [   59.000506][    C0]  ? rcu_read_lock_held_common+0x5/0x40
>> [   59.000519][    C0]  ? rcu_read_lock_held_common+0xe/0x40
>> [   59.000528][    C0]  ? rcu_read_lock_sched_held+0x23/0x80
>> [   59.000539][    C0]  ? lock_release+0xc7/0x2b0
>> [   59.000560][    C0]  ? __perf_event_account_interrupt+0x116/0x160
>> [   59.000576][    C0]  __perf_event_account_interrupt+0x116/0x160
>> [   59.000589][    C0]  __perf_event_overflow+0x3e/0x120
>> [   59.000604][    C0]  handle_pmi_common+0x30f/0x400
>> [   59.000611][    C0]  ? perf_ftrace_function_call+0x268/0x2e0
>> [   59.000620][    C0]  ? perf_ftrace_function_call+0x53/0x2e0
>> [   59.000663][    C0]  ? 0xffffffffa00ba083
>> [   59.000689][    C0]  ? 0xffffffffa00ba083
>> [   59.000729][    C0]  ? intel_pmu_handle_irq+0x120/0x620
>> [   59.000737][    C0]  ? handle_pmi_common+0x5/0x400
>> [   59.000743][    C0]  intel_pmu_handle_irq+0x120/0x620
>> [   59.000767][    C0]  perf_event_nmi_handler+0x30/0x50
>> [   59.000779][    C0]  nmi_handle+0xba/0x2a0
>> [   59.000806][    C0]  default_do_nmi+0x45/0xf0
>> [   59.000819][    C0]  exc_nmi+0x155/0x170
>> [   59.000838][    C0]  end_repeat_nmi+0x16/0x55
>> [   59.000845][    C0] RIP: 0010:__sanitizer_cov_trace_pc+0xd/0x60
>> [   59.000853][    C0] Code: 00 75 10 65 48 8b 04 25 c0 71 01 00 48 8b 80 88 15 00 00 f3 c3 0f 1f 84 00 00 00 00 00 65 8b 05 09 77 e0 7e 89 c1 48 8b 34 24 <65> 48 8b 14 25 c0 71 01 00 81 e1 00 01 00 00 a9 00 01 ff 00 74 10
>> [   59.000858][    C0] RSP: 0000:ffffc90000003dd0 EFLAGS: 00000046
>> [   59.000863][    C0] RAX: 0000000080010001 RBX: ffffffff82a1db40 RCX: 0000000080010001
>> [   59.000867][    C0] RDX: ffff888106f5a180 RSI: ffffffff81009613 RDI: 0000000000000000
>> [   59.000871][    C0] RBP: ffff88813bc40d08 R08: ffff888106f5abb8 R09: 00000000fffffffe
>> [   59.000875][    C0] R10: ffffc90000003be0 R11: 00000000ffd17b4b R12: ffff88813bc118a0
>> [   59.000878][    C0] R13: ffff88813bc40c00 R14: 0000000000000000 R15: ffffffff82a1db40
>> [   59.000906][    C0]  ? x86_pmu_enable+0x383/0x440
>> [   59.000924][    C0]  ? __sanitizer_cov_trace_pc+0xd/0x60
>> [   59.000942][    C0]  ? intel_pmu_handle_irq+0x284/0x620
>> [   59.000954][    C0]  </NMI>
>> [   59.000957][    C0] WARNING: stack recursion on stack type 6
>> [   59.000960][    C0] Modules linked in:
>> [   59.120070][    C0] ---[ end trace 07eb1e3908914794 ]---
>> [   59.120075][    C0] RIP: 0010:perf_swevent_get_recursion_context+0x0/0x70
>> [   59.120087][    C0] Code: 48 03 43 28 48 8b 0c 24 bb 01 00 00 00 4c 29 f0 48 39 c8 48 0f 47 c1 49 89 45 08 e9 48 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 <55> 53 e8 89 18 f2 ff 48 c7 c2 20 4d 03 00 65 48 03 15 5a 34 d2 7e
>> [   59.120092][    C0] RSP: 0018:fffffe000000b000 EFLAGS: 00010046
>> [   59.120098][    C0] RAX: 0000000080120005 RBX: fffffe000000b050 RCX: 0000000000000000
>> [   59.120102][    C0] RDX: ffff888106f5a180 RSI: ffffffff812696d1 RDI: 000000000000001c
>> [   59.120106][    C0] RBP: 000000000000001c R08: 0000000000000001 R09: 0000000000000000
>> [   59.120110][    C0] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>> [   59.120114][    C0] R13: fffffe000000b044 R14: 0000000000000001 R15: 0000000000000001
>> [   59.120118][    C0] FS:  00007f21fc62c740(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000
>> [   59.120125][    C0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [   59.120129][    C0] CR2: fffffe000000aff8 CR3: 0000000106e2e001 CR4: 00000000003606f0
>> [   59.120133][    C0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [   59.120137][    C0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [   59.120141][    C0] Kernel panic - not syncing: Fatal exception in interrupt
>> [   59.120540][    C0] Kernel Offset: disabled
>>
>> And below is the reproducer:
>>
>>
>> // autogenerated by syzkaller (https://github.com/google/syzkaller)
>>
>> #define _GNU_SOURCE
>>
>> #include <dirent.h>
>> #include <endian.h>
>> #include <errno.h>
>> #include <fcntl.h>
>> #include <signal.h>
>> #include <stdarg.h>
>> #include <stdbool.h>
>> #include <stdint.h>
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <string.h>
>> #include <sys/prctl.h>
>> #include <sys/stat.h>
>> #include <sys/syscall.h>
>> #include <sys/types.h>
>> #include <sys/wait.h>
>> #include <time.h>
>> #include <unistd.h>
>>
>> static void sleep_ms(uint64_t ms)
>> {
>> 	usleep(ms * 1000);
>> }
>>
>> static uint64_t current_time_ms(void)
>> {
>> 	struct timespec ts;
>> 	if (clock_gettime(CLOCK_MONOTONIC, &ts))
>> 		exit(1);
>> 	return (uint64_t)ts.tv_sec * 1000 + (uint64_t)ts.tv_nsec / 1000000;
>> }
>>
>> #define BITMASK(bf_off,bf_len) (((1ull << (bf_len)) - 1) << (bf_off))
>> #define STORE_BY_BITMASK(type,htobe,addr,val,bf_off,bf_len) *(type*)(addr) = htobe((htobe(*(type*)(addr)) & ~BITMASK((bf_off), (bf_len))) | (((type)(val) << (bf_off)) & BITMASK((bf_off), (bf_len))))
>>
>> static bool write_file(const char* file, const char* what, ...)
>> {
>> 	char buf[1024];
>> 	va_list args;
>> 	va_start(args, what);
>> 	vsnprintf(buf, sizeof(buf), what, args);
>> 	va_end(args);
>> 	buf[sizeof(buf) - 1] = 0;
>> 	int len = strlen(buf);
>> 	int fd = open(file, O_WRONLY | O_CLOEXEC);
>> 	if (fd == -1)
>> 		return false;
>> 	if (write(fd, buf, len) != len) {
>> 		int err = errno;
>> 		close(fd);
>> 		errno = err;
>> 		return false;
>> 	}
>> 	close(fd);
>> 	return true;
>> }
>>
>> static void kill_and_wait(int pid, int* status)
>> {
>> 	kill(-pid, SIGKILL);
>> 	kill(pid, SIGKILL);
>> 	for (int i = 0; i < 100; i++) {
>> 		if (waitpid(-1, status, WNOHANG | __WALL) == pid)
>> 			return;
>> 		usleep(1000);
>> 	}
>> 	DIR* dir = opendir("/sys/fs/fuse/connections");
>> 	if (dir) {
>> 		for (;;) {
>> 			struct dirent* ent = readdir(dir);
>> 			if (!ent)
>> 				break;
>> 			if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
>> 				continue;
>> 			char abort[300];
>> 			snprintf(abort, sizeof(abort), "/sys/fs/fuse/connections/%s/abort", ent->d_name);
>> 			int fd = open(abort, O_WRONLY);
>> 			if (fd == -1) {
>> 				continue;
>> 			}
>> 			if (write(fd, abort, 1) < 0) {
>> 			}
>> 			close(fd);
>> 		}
>> 		closedir(dir);
>> 	} else {
>> 	}
>> 	while (waitpid(-1, status, __WALL) != pid) {
>> 	}
>> }
>>
>> static void setup_test()
>> {
>> 	prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
>> 	setpgrp();
>> 	write_file("/proc/self/oom_score_adj", "1000");
>> }
>>
>> static void execute_one(void);
>>
>> #define WAIT_FLAGS __WALL
>>
>> static void loop(void)
>> {
>> 	int iter = 0;
>> 	for (;; iter++) {
>> 		int pid = fork();
>> 		if (pid < 0)
>> 			exit(1);
>> 		if (pid == 0) {
>> 			setup_test();
>> 			execute_one();
>> 			exit(0);
>> 		}
>> 		int status = 0;
>> 		uint64_t start = current_time_ms();
>> 		for (;;) {
>> 			if (waitpid(-1, &status, WNOHANG | WAIT_FLAGS) == pid)
>> 				break;
>> 			sleep_ms(1);
>> 			if (current_time_ms() - start < 5000) {
>> 				continue;
>> 			}
>> 			kill_and_wait(pid, &status);
>> 			break;
>> 		}
>> 	}
>> }
>>
>> void execute_one(void)
>> {
>> *(uint32_t*)0x20000380 = 2;
>> *(uint32_t*)0x20000384 = 0x70;
>> *(uint8_t*)0x20000388 = 1;
>> *(uint8_t*)0x20000389 = 0;
>> *(uint8_t*)0x2000038a = 0;
>> *(uint8_t*)0x2000038b = 0;
>> *(uint32_t*)0x2000038c = 0;
>> *(uint64_t*)0x20000390 = 0;
>> *(uint64_t*)0x20000398 = 0;
>> *(uint64_t*)0x200003a0 = 0;
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 0, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 1, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 2, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 3, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 4, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 5, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 6, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 7, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 8, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 9, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 10, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 11, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 12, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 13, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 14, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 15, 2);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 17, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 18, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 19, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 20, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 21, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 22, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 23, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 24, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 25, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 26, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 27, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 28, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200003a8, 0, 29, 35);
>> *(uint32_t*)0x200003b0 = 0;
>> *(uint32_t*)0x200003b4 = 0;
>> *(uint64_t*)0x200003b8 = 0;
>> *(uint64_t*)0x200003c0 = 0;
>> *(uint64_t*)0x200003c8 = 0;
>> *(uint64_t*)0x200003d0 = 0;
>> *(uint32_t*)0x200003d8 = 0;
>> *(uint32_t*)0x200003dc = 0;
>> *(uint64_t*)0x200003e0 = 0;
>> *(uint32_t*)0x200003e8 = 0;
>> *(uint16_t*)0x200003ec = 0;
>> *(uint16_t*)0x200003ee = 0;
>> 	syscall(__NR_perf_event_open, 0x20000380ul, -1, 0ul, -1, 0ul);
>> *(uint32_t*)0x20000080 = 0;
>> *(uint32_t*)0x20000084 = 0x70;
>> *(uint8_t*)0x20000088 = 0;
>> *(uint8_t*)0x20000089 = 0;
>> *(uint8_t*)0x2000008a = 0;
>> *(uint8_t*)0x2000008b = 0;
>> *(uint32_t*)0x2000008c = 0;
>> *(uint64_t*)0x20000090 = 0x9c;
>> *(uint64_t*)0x20000098 = 0;
>> *(uint64_t*)0x200000a0 = 0;
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 0, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 1, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 2, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 3, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 4, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 5, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 6, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 7, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 8, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 9, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 10, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 11, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 12, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 13, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 14, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 15, 2);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 17, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 18, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 19, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 20, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 21, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 22, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 23, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 24, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 25, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 26, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 27, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 28, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x200000a8, 0, 29, 35);
>> *(uint32_t*)0x200000b0 = 0;
>> *(uint32_t*)0x200000b4 = 0;
>> *(uint64_t*)0x200000b8 = 0;
>> *(uint64_t*)0x200000c0 = 0;
>> *(uint64_t*)0x200000c8 = 0;
>> *(uint64_t*)0x200000d0 = 0;
>> *(uint32_t*)0x200000d8 = 0;
>> *(uint32_t*)0x200000dc = 0;
>> *(uint64_t*)0x200000e0 = 0;
>> *(uint32_t*)0x200000e8 = 0;
>> *(uint16_t*)0x200000ec = 0;
>> *(uint16_t*)0x200000ee = 0;
>> 	syscall(__NR_perf_event_open, 0x20000080ul, -1, 0ul, -1, 0ul);
>> *(uint32_t*)0x20000140 = 2;
>> *(uint32_t*)0x20000144 = 0x70;
>> *(uint8_t*)0x20000148 = 0x47;
>> *(uint8_t*)0x20000149 = 1;
>> *(uint8_t*)0x2000014a = 0;
>> *(uint8_t*)0x2000014b = 0;
>> *(uint32_t*)0x2000014c = 0;
>> *(uint64_t*)0x20000150 = 9;
>> *(uint64_t*)0x20000158 = 0x61220;
>> *(uint64_t*)0x20000160 = 0;
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 0, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 1, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 2, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 3, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 4, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 5, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 6, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 7, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 8, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 9, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 10, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 11, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 12, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 13, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 14, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 15, 2);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 17, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 18, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 19, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 20, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 21, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 22, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 23, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 24, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 25, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 26, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 27, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 28, 1);
>> STORE_BY_BITMASK(uint64_t, , 0x20000168, 0, 29, 35);
>> *(uint32_t*)0x20000170 = 0;
>> *(uint32_t*)0x20000174 = 0;
>> *(uint64_t*)0x20000178 = 0;
>> *(uint64_t*)0x20000180 = 0;
>> *(uint64_t*)0x20000188 = 0;
>> *(uint64_t*)0x20000190 = 1;
>> *(uint32_t*)0x20000198 = 0;
>> *(uint32_t*)0x2000019c = 0;
>> *(uint64_t*)0x200001a0 = 2;
>> *(uint32_t*)0x200001a8 = 0;
>> *(uint16_t*)0x200001ac = 0;
>> *(uint16_t*)0x200001ae = 0;
>> 	syscall(__NR_perf_event_open, 0x20000140ul, 0, -1ul, -1, 0ul);
>>
>> }
>> int main(void)
>> {
>> 	syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
>> 	syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
>> 	syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
>> 	loop();
>> 	return 0;
>> }
>>
>> Regards,
>> Michael Wang
>>
>>
>> On 2021/9/13 10:49 PM, Dave Hansen wrote:
>>> On 9/12/21 8:30 PM, 王贇 wrote:
>>>> According to the trace, the story is as follows: the NMI
>>>> triggered perf IRQ throttling and called perf_log_throttle(),
>>>> which triggered the swevent overflow; the overflow handling
>>>> called perf_callchain_user(), which triggered a user page fault,
>>>> and the page-fault path triggered perf ftrace, which finally led
>>>> to a suspected stack overflow.
>>>>
>>>> This patch disables ftrace on fault.c, which helps avoid the panic.
>>> ...
>>>> +# Disable ftrace to avoid stack overflow.
>>>> +CFLAGS_REMOVE_fault.o = $(CC_FLAGS_FTRACE)
>>>
>>> Was this observed on a mainline kernel?
>>>
>>> How reproducible is this?
>>>
>>> I suspect we're going into do_user_addr_fault(), then falling in here:
>>>
>>>>         if (unlikely(faulthandler_disabled() || !mm)) {
>>>>                 bad_area_nosemaphore(regs, error_code, address);
>>>>                 return;
>>>>         }
>>>
>>> Then something double faults in perf_swevent_get_recursion_context().
>>> But, you snipped all of the register dump out so I can't quite see
>>> what's going on and what might have caused *that* fault.  But, in my
>>> kernel perf_swevent_get_recursion_context+0x0/0x70 is:
>>>
>>> 	   mov    $0x27d00,%rdx
>>>
>>> which is rather unlikely to fault.
>>>
>>> Either way, we don't want to keep ftrace out of fault.c.  This patch is
>>> just a hack, and doesn't really try to fix the underlying problem.  This
>>> situation *should* be handled today.  There's code there to handle it.
>>>
>>> Something else really funky is going on.
>>>
Dave Hansen Sept. 14, 2021, 4:16 p.m. UTC | #6
On 9/14/21 12:23 AM, 王贇 wrote:
> 
> On 2021/9/14 11:02 AM, 王贇 wrote:
> [snip]
>> [   44.133509][    C0] traps: PANIC: double fault, error_code: 0x0
>> [   44.133519][    C0] double fault: 0000 [#1] SMP PTI
>> [   44.133526][    C0] CPU: 0 PID: 743 Comm: a.out Not tainted 5.14.0-next-20210913 #469
>> [   44.133532][    C0] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
>> [   44.133536][    C0] RIP: 0010:perf_swevent_get_recursion_context+0x0/0x70
>> [   44.133549][    C0] Code: 48 03 43 28 48 8b 0c 24 bb 01 00 00 00 4c 29 f0 48 39 c8 48 0f 47 c1 49 89 45 08 e9 48 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 <55> 53 e8 09 20 f2 ff 48 c7 c2 20 4d 03 00 65 48 03 15 5a 3b d2 7e
>> [   44.133556][    C0] RSP: 0018:fffffe000000b000 EFLAGS: 00010046
> One more data point: I printed '__this_cpu_ist_bottom_va(NMI)'
> on cpu0, and it is exactly the RSP value fffffe000000b000. Does this
> imply that the NMI stack overflowed?

Yep.  I have the feeling some of your sanitizer and other debugging is
eating the stack:

> [   44.134987][    C0]  ? __sanitizer_cov_trace_pc+0x7/0x60
> [   44.135005][    C0]  ? kcov_common_handle+0x30/0x30

Just turning off tracing for the page fault handler is papering over the
problem.  It'll just come back later with a slightly different form.
王贇 Sept. 15, 2021, 1:56 a.m. UTC | #7
On 2021/9/15 12:16 AM, Dave Hansen wrote:
> On 9/14/21 12:23 AM, 王贇 wrote:
>>
>> On 2021/9/14 11:02 AM, 王贇 wrote:
>> [snip]
>>> [   44.133509][    C0] traps: PANIC: double fault, error_code: 0x0
>>> [   44.133519][    C0] double fault: 0000 [#1] SMP PTI
>>> [   44.133526][    C0] CPU: 0 PID: 743 Comm: a.out Not tainted 5.14.0-next-20210913 #469
>>> [   44.133532][    C0] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
>>> [   44.133536][    C0] RIP: 0010:perf_swevent_get_recursion_context+0x0/0x70
>>> [   44.133549][    C0] Code: 48 03 43 28 48 8b 0c 24 bb 01 00 00 00 4c 29 f0 48 39 c8 48 0f 47 c1 49 89 45 08 e9 48 ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 <55> 53 e8 09 20 f2 ff 48 c7 c2 20 4d 03 00 65 48 03 15 5a 3b d2 7e
>>> [   44.133556][    C0] RSP: 0018:fffffe000000b000 EFLAGS: 00010046
>> One more data point: I printed '__this_cpu_ist_bottom_va(NMI)'
>> on cpu0, and it is exactly the RSP value fffffe000000b000. Does this
>> imply that the NMI stack overflowed?
> 
> Yep.  I have the feeling some of your sanitizer and other debugging is
> eating the stack:

Could be. In another thread we have confirmed that the exception stack
overflowed.

> 
>> [   44.134987][    C0]  ? __sanitizer_cov_trace_pc+0x7/0x60
>> [   44.135005][    C0]  ? kcov_common_handle+0x30/0x30
> 
> Just turning off tracing for the page fault handler is papering over the
> problem.  It'll just come back later with a slightly different form.
> 

Cool~ please let me know when you have the proper approach.

Regards,
Michael Wang
Dave Hansen Sept. 15, 2021, 3:27 a.m. UTC | #8
On 9/14/21 6:56 PM, 王贇 wrote:
>>> [   44.134987][    C0]  ? __sanitizer_cov_trace_pc+0x7/0x60
>>> [   44.135005][    C0]  ? kcov_common_handle+0x30/0x30
>> Just turning off tracing for the page fault handler is papering over the
>> problem.  It'll just come back later with a slightly different form.
>>
> Cool~ please let me know when you have the proper approach.

It's an entertaining issue, but I wasn't planning on fixing it myself.
王贇 Sept. 15, 2021, 7:22 a.m. UTC | #9
On 2021/9/15 11:27 AM, Dave Hansen wrote:
> On 9/14/21 6:56 PM, 王贇 wrote:
>>>> [   44.134987][    C0]  ? __sanitizer_cov_trace_pc+0x7/0x60
>>>> [   44.135005][    C0]  ? kcov_common_handle+0x30/0x30
>>> Just turning off tracing for the page fault handler is papering over the
>>> problem.  It'll just come back later with a slightly different form.
>>>
>> Cool~ please let me know when you have the proper approach.
> 
> It's an entertaining issue, but I wasn't planning on fixing it myself.
> 

Do you have any suggestions on how we should fix the problem?

I'd like to help fix it, but it sounds like none of the known working
approaches is acceptable...

Regards,
Michael Wang
王贇 Sept. 15, 2021, 7:34 a.m. UTC | #10
On 2021/9/15 3:22 PM, 王贇 wrote:
> 
> 
> On 2021/9/15 11:27 AM, Dave Hansen wrote:
>> On 9/14/21 6:56 PM, 王贇 wrote:
>>>>> [   44.134987][    C0]  ? __sanitizer_cov_trace_pc+0x7/0x60
>>>>> [   44.135005][    C0]  ? kcov_common_handle+0x30/0x30
>>>> Just turning off tracing for the page fault handler is papering over the
>>>> problem.  It'll just come back later with a slightly different form.
>>>>
>>> Cool~ please let me know when you have the proper approach.
>>
>> It's an entertaining issue, but I wasn't planning on fixing it myself.
>>
> 
> Do you have any suggestions on how we should fix the problem?
> 
> I'd like to help fix it, but it sounds like none of the known working
> approaches is acceptable...

Hi, Dave, Peter

What if we just increase the stack size when ftrace is enabled?

Maybe like:

diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index a8d4ad85..bc2e0c1 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -12,10 +12,16 @@
 #define KASAN_STACK_ORDER 0
 #endif

+#ifdef CONFIG_FUNCTION_TRACER
+#define FTRACE_STACK_ORDER 1
+#else
+#define FTRACE_STACK_ORDER 0
+#endif
+
 #define THREAD_SIZE_ORDER      (2 + KASAN_STACK_ORDER)
 #define THREAD_SIZE  (PAGE_SIZE << THREAD_SIZE_ORDER)

-#define EXCEPTION_STACK_ORDER (0 + KASAN_STACK_ORDER)
+#define EXCEPTION_STACK_ORDER (0 + KASAN_STACK_ORDER + FTRACE_STACK_ORDER)
 #define EXCEPTION_STKSZ (PAGE_SIZE << EXCEPTION_STACK_ORDER)

 #define IRQ_STACK_ORDER (2 + KASAN_STACK_ORDER)

Just as is done for KASAN, we would give more stack space for ftrace. Does
this look acceptable to you?

Regards,
Michael Wang

> 
> Regards,
> Michael Wang
>

Patch

diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 5864219..1dbdca5 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -1,5 +1,9 @@ 
 # SPDX-License-Identifier: GPL-2.0
 # Kernel does not boot with instrumentation of tlb.c and mem_encrypt*.c
+
+# Disable ftrace to avoid stack overflow.
+CFLAGS_REMOVE_fault.o = $(CC_FLAGS_FTRACE)
+
 KCOV_INSTRUMENT_tlb.o			:= n
 KCOV_INSTRUMENT_mem_encrypt.o		:= n
 KCOV_INSTRUMENT_mem_encrypt_identity.o	:= n