[v1,net-next] ref_tracker: Print allocator task name.

Message ID 20240403201715.33883-1-kuniyu@amazon.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Series [v1,net-next] ref_tracker: Print allocator task name.

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 945 this patch: 945
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers fail 1 maintainers not CCed: akpm@linux-foundation.org
netdev/build_clang success Errors and warnings before: 955 this patch: 955
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 956 this patch: 956
netdev/checkpatch warning WARNING: line length of 84 exceeds 80 columns; WARNING: line length of 88 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest fail net-next-2024-04-04--03-00 (tests: 942)

Commit Message

Kuniyuki Iwashima April 3, 2024, 8:17 p.m. UTC
Even after syzkaller has triggered a bug, it often takes a long time
to find a repro.  In such a case, I usually try to find a repro with
syz-prog2c based on the syz-lang program in the log.

When ref_tracker detects a leaked reference, it prints the stack trace
of the point where the tracker was allocated.  However, the stack trace
does not include the process name.

If the stack trace also printed the allocator's task name, it would be
easier to pick the right syz-executor.X out of several candidates.

  20:58:00 executing program 5:
  ...
  [ 2792.008275][T406785] CPU: 0 PID: 406785 Comm: syz-executor.5

To make debugging easier, let's save the task name in ref_tracker and
print it with the stack trace.

Tested with a buggy module [0]:

  # unshare -n insmod ./kern_sk.ko
  kern_sk: loading out-of-tree module taints kernel.
  ref_tracker: net notrefcnt@0000000019e0eaac was allocated by insmod and has 1/1 users at
       sk_alloc+0x498/0x4c0
       inet_create+0x128/0x530
       __sock_create+0x17a/0x3a0
       do_one_initcall+0x57/0x2a0
       do_init_module+0x5f/0x210
       init_module_from_file+0x86/0xc0
       idempotent_init_module+0x178/0x230
       __x64_sys_finit_module+0x56/0x90
       do_syscall_64+0xc4/0x1d0
       entry_SYSCALL_64_after_hwframe+0x46/0x4e

  ------------[ cut here ]------------
  WARNING: CPU: 2 PID: 48 at lib/ref_tracker.c:184 ref_tracker_dir_exit+0xfb/0x160
  Modules linked in: kern_sk(O)
  CPU: 2 PID: 48 Comm: kworker/u16:2 Tainted: G           O       6.9.0-rc1-00371-g48dca48885cd-dirty #8

Link: https://lore.kernel.org/netdev/20221021170154.88207-1-kuniyu@amazon.com/ [0]
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 lib/ref_tracker.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
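
For readers who want to see the idea outside the kernel, the following is a
minimal userspace sketch of what the commit message describes: keep a
fixed-size task name next to the stack handle and include it in the report
line.  Everything here is an assumption made for illustration:
struct tracker_model, MODEL_COMM_LEN, tracker_model_set_comm() and
tracker_model_format() are hypothetical names, not kernel APIs, and the
directory address printed by "%pK" in the real message is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical constant mirroring the size of a kernel task name buffer. */
#define MODEL_COMM_LEN 16

/* Hypothetical userspace stand-in for the patched struct ref_tracker. */
struct tracker_model {
	unsigned int stack_handle;	/* stands in for depot_stack_handle_t */
	char comm[MODEL_COMM_LEN];	/* allocator task name, NUL-terminated */
};

/* Record the allocator name, truncating it to MODEL_COMM_LEN - 1 bytes. */
void tracker_model_set_comm(struct tracker_model *t, const char *name)
{
	assert(t != NULL && name != NULL);
	snprintf(t->comm, sizeof(t->comm), "%s", name);
}

/*
 * Format a header line in the style of the patched pr_ostream() call.
 * Returns what snprintf() returns: the length the full line would have.
 */
int tracker_model_format(char *buf, size_t len, const char *dir_name,
			 const struct tracker_model *t,
			 unsigned int count, unsigned int total)
{
	return snprintf(buf, len,
			"%s was allocated by %s and has %u/%u users at",
			dir_name, t->comm, count, total);
}
```

Calling tracker_model_set_comm(&t, "insmod") and then
tracker_model_format(buf, sizeof(buf), "net notrefcnt", &t, 1, 1) yields a
line shaped like the one in the test output above, minus the address.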

Comments

Paolo Abeni April 4, 2024, 2:42 p.m. UTC | #1
On Wed, 2024-04-03 at 13:17 -0700, Kuniyuki Iwashima wrote:
> @@ -208,6 +213,8 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
>  	}
>  	nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
>  	tracker->alloc_stack_handle = stack_depot_save(entries, nr_entries, gfp);
> +	if (in_task())
> +		get_task_comm(tracker->comm, current);

This apparently causes a lockdep splat, hit by the CI:

https://netdev-3.bots.linux.dev/vmksft-net-dbg/results/537021/16-vrf-route-leaking-sh/stderr

it looks like get_task_comm() is for BH-only scope.

Cheers,

Paolo
Kuniyuki Iwashima April 4, 2024, 4:51 p.m. UTC | #2
From: Paolo Abeni <pabeni@redhat.com>
Date: Thu, 04 Apr 2024 16:42:55 +0200
> On Wed, 2024-04-03 at 13:17 -0700, Kuniyuki Iwashima wrote:
> > @@ -208,6 +213,8 @@ int ref_tracker_alloc(struct ref_tracker_dir *dir,
> >  	}
> >  	nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
> >  	tracker->alloc_stack_handle = stack_depot_save(entries, nr_entries, gfp);
> > +	if (in_task())
> > +		get_task_comm(tracker->comm, current);
> 
> This apparently causes a lockdep splat, hit by the CI:
> 
> https://netdev-3.bots.linux.dev/vmksft-net-dbg/results/537021/16-vrf-route-leaking-sh/stderr
> 
> it looks like get_task_comm() is for BH-only scope.

Ah exactly, I'll move it down to the spin_lock_irqsave() section below.

Thanks!
Patch

diff --git a/lib/ref_tracker.c b/lib/ref_tracker.c
index cf5609b1ca79..91c73725acf5 100644
--- a/lib/ref_tracker.c
+++ b/lib/ref_tracker.c
@@ -5,6 +5,7 @@ 
 #include <linux/export.h>
 #include <linux/list_sort.h>
 #include <linux/ref_tracker.h>
+#include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/stacktrace.h>
 #include <linux/stackdepot.h>
@@ -17,6 +18,7 @@  struct ref_tracker {
 	bool			dead;
 	depot_stack_handle_t	alloc_stack_handle;
 	depot_stack_handle_t	free_stack_handle;
+	char			comm[TASK_COMM_LEN];
 };
 
 struct ref_tracker_dir_stats {
@@ -25,6 +27,7 @@  struct ref_tracker_dir_stats {
 	struct {
 		depot_stack_handle_t stack_handle;
 		unsigned int count;
+		char comm[TASK_COMM_LEN];
 	} stacks[];
 };
 
@@ -54,6 +57,7 @@  ref_tracker_get_stats(struct ref_tracker_dir *dir, unsigned int limit)
 		if (i >= stats->count) {
 			stats->stacks[i].stack_handle = stack;
 			stats->stacks[i].count = 0;
+			memcpy(stats->stacks[i].comm, tracker->comm, TASK_COMM_LEN);
 			++stats->count;
 		}
 		++stats->stacks[i].count;
@@ -107,7 +111,8 @@  __ref_tracker_dir_pr_ostream(struct ref_tracker_dir *dir,
 		stack = stats->stacks[i].stack_handle;
 		if (sbuf && !stack_depot_snprint(stack, sbuf, STACK_BUF_SIZE, 4))
 			sbuf[0] = 0;
-		pr_ostream(s, "%s@%pK has %d/%d users at\n%s\n", dir->name, dir,
+		pr_ostream(s, "%s@%pK was allocated by %s and has %d/%d users at\n%s\n",
+			   dir->name, dir, stats->stacks[i].comm,
 			   stats->stacks[i].count, stats->total, sbuf);
 		skipped -= stats->stacks[i].count;
 	}
@@ -208,6 +213,8 @@  int ref_tracker_alloc(struct ref_tracker_dir *dir,
 	}
 	nr_entries = stack_trace_save(entries, ARRAY_SIZE(entries), 1);
 	tracker->alloc_stack_handle = stack_depot_save(entries, nr_entries, gfp);
+	if (in_task())
+		get_task_comm(tracker->comm, current);
 
 	spin_lock_irqsave(&dir->lock, flags);
 	list_add(&tracker->head, &dir->list);
@@ -244,7 +251,7 @@  int ref_tracker_free(struct ref_tracker_dir *dir,
 	if (tracker->dead) {
 		pr_err("reference already released.\n");
 		if (tracker->alloc_stack_handle) {
-			pr_err("allocated in:\n");
+			pr_err("allocated by %s in:\n", tracker->comm);
 			stack_depot_print(tracker->alloc_stack_handle);
 		}
 		if (tracker->free_stack_handle) {
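
One consequence of the ref_tracker_get_stats() hunk may be worth spelling
out: the task name is copied only when a new per-stack entry is created, so
trackers that share a stack handle but were allocated by different tasks
would presumably all be reported under the first name seen.  The userspace
model below is a sketch of that grouping, not the kernel code.  The lookup
loop is an assumption about the surrounding function (it is not part of the
hunk), and struct tracked_ref, struct model_stats, MODEL_MAX_STACKS and
model_collect_stats() are hypothetical names.

```c
#include <assert.h>
#include <string.h>

#define MODEL_NAME_LEN   16	/* hypothetical, mirrors a task name buffer */
#define MODEL_MAX_STACKS 8	/* hypothetical cap on distinct stacks */

/* Hypothetical stand-in for one live or dead tracker. */
struct tracked_ref {
	unsigned int stack_handle;
	char comm[MODEL_NAME_LEN];
	int dead;
};

/* Hypothetical stand-in for struct ref_tracker_dir_stats. */
struct model_stats {
	unsigned int total;	/* live trackers seen */
	unsigned int count;	/* distinct stack handles recorded */
	struct {
		unsigned int stack_handle;
		unsigned int count;
		char comm[MODEL_NAME_LEN];
	} stacks[MODEL_MAX_STACKS];
};

/* Group live trackers by stack handle, keeping the first task name seen. */
void model_collect_stats(const struct tracked_ref *refs, size_t n,
			 struct model_stats *stats)
{
	assert(refs != NULL && stats != NULL);
	memset(stats, 0, sizeof(*stats));

	for (size_t k = 0; k < n; k++) {
		unsigned int i;

		if (refs[k].dead)
			continue;
		++stats->total;

		/* Assumed lookup: find an existing entry for this stack. */
		for (i = 0; i < stats->count; ++i)
			if (stats->stacks[i].stack_handle == refs[k].stack_handle)
				break;
		if (i >= MODEL_MAX_STACKS)
			continue;	/* counted in total, but not itemized */

		if (i >= stats->count) {
			/* New entry: this is the only place the name is copied. */
			stats->stacks[i].stack_handle = refs[k].stack_handle;
			stats->stacks[i].count = 0;
			memcpy(stats->stacks[i].comm, refs[k].comm, MODEL_NAME_LEN);
			++stats->count;
		}
		++stats->stacks[i].count;
	}
}
```

If two syz-executor tasks leak a reference through the same call path, this
model reports both under the first task's name with a count of 2, which is
something to keep in mind when reading the "allocated by" field.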