Message ID | 20241108113455.2924361-1-elver@google.com (mailing list archive) |
---|---|
State | New |
Series | [v3,1/2] tracing: Add task_prctl_unknown tracepoint |
On Fri, 8 Nov 2024 at 12:35, Marco Elver <elver@google.com> wrote:
>
> prctl() is a complex syscall which multiplexes its functionality based
> on a large set of PR_* options. Currently we count 64 such options. The
> return value of unknown options is -EINVAL, and doesn't distinguish from
> known options that were passed invalid args that also return -EINVAL.
>
> To understand if programs are attempting to use prctl() options not yet
> available on the running kernel, provide the task_prctl_unknown
> tracepoint.
>
> Note, this tracepoint is in an unlikely cold path, and would therefore
> be suitable for continuous monitoring (e.g. via perf_event_open).
>
> While the above is likely the simplest usecase, additionally this
> tracepoint can help unlock some testing scenarios (where probing
> sys_enter or sys_exit causes undesirable performance overheads):
>
>   a. unprivileged triggering of a test module: test modules may register a
>      probe to be called back on task_prctl_unknown, and pick a very large
>      unknown prctl() option upon which they perform a test function for an
>      unprivileged user;
>
>   b. unprivileged triggering of an eBPF program function: similar
>      as idea (a).
>
> Example trace_pipe output:
>
>   test-380 [001] ..... 78.142904: task_prctl_unknown: option=1234 arg2=101 arg3=102 arg4=103 arg5=104
>
> Signed-off-by: Marco Elver <elver@google.com>

Steven, unless there are any further objections, would you be able to
take this through the tracing tree?

Many thanks!

> ---
> v3:
> * Remove "comm".
>
> v2:
> * Remove "pid" in trace output (suggested by Steven).
> ---
>  include/trace/events/task.h | 37 +++++++++++++++++++++++++++++++++++++
>  kernel/sys.c                |  3 +++
>  2 files changed, 40 insertions(+)
>
> diff --git a/include/trace/events/task.h b/include/trace/events/task.h
> index 47b527464d1a..209d315852fb 100644
> --- a/include/trace/events/task.h
> +++ b/include/trace/events/task.h
> @@ -56,6 +56,43 @@ TRACE_EVENT(task_rename,
>  		__entry->newcomm, __entry->oom_score_adj)
>  );
>
> +/**
> + * task_prctl_unknown - called on unknown prctl() option
> + * @option:	option passed
> + * @arg2:	arg2 passed
> + * @arg3:	arg3 passed
> + * @arg4:	arg4 passed
> + * @arg5:	arg5 passed
> + *
> + * Called on an unknown prctl() option.
> + */
> +TRACE_EVENT(task_prctl_unknown,
> +
> +	TP_PROTO(int option, unsigned long arg2, unsigned long arg3,
> +		 unsigned long arg4, unsigned long arg5),
> +
> +	TP_ARGS(option, arg2, arg3, arg4, arg5),
> +
> +	TP_STRUCT__entry(
> +		__field(	int,		option)
> +		__field(	unsigned long,	arg2)
> +		__field(	unsigned long,	arg3)
> +		__field(	unsigned long,	arg4)
> +		__field(	unsigned long,	arg5)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->option = option;
> +		__entry->arg2 = arg2;
> +		__entry->arg3 = arg3;
> +		__entry->arg4 = arg4;
> +		__entry->arg5 = arg5;
> +	),
> +
> +	TP_printk("option=%d arg2=%ld arg3=%ld arg4=%ld arg5=%ld",
> +		  __entry->option, __entry->arg2, __entry->arg3, __entry->arg4, __entry->arg5)
> +);
> +
>  #endif
>
>  /* This part must be outside protection */
> diff --git a/kernel/sys.c b/kernel/sys.c
> index 4da31f28fda8..b366cef102ec 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -75,6 +75,8 @@
>  #include <asm/io.h>
>  #include <asm/unistd.h>
>
> +#include <trace/events/task.h>
> +
>  #include "uid16.h"
>
>  #ifndef SET_UNALIGN_CTL
> @@ -2785,6 +2787,7 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
>  		error = RISCV_SET_ICACHE_FLUSH_CTX(arg2, arg3);
>  		break;
>  	default:
> +		trace_task_prctl_unknown(option, arg2, arg3, arg4, arg5);
>  		error = -EINVAL;
>  		break;
>  	}
> --
> 2.47.0.277.g8800431eea-goog
>
On Fri, 15 Nov 2024 13:00:00 +0100
Marco Elver <elver@google.com> wrote:

> Steven, unless there are any further objections, would you be able to
> take this through the tracing tree?
>
> Many thanks!

This isn't my file. Trace events usually belong to the subsystems that
use them. As this adds an event to kernel/sys.c which doesn't really have
an owner, then I would ask Andrew Morton to take it.

-- Steve
On Fri, 15 Nov 2024 at 14:27, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Fri, 15 Nov 2024 13:00:00 +0100
> Marco Elver <elver@google.com> wrote:
>
> > Steven, unless there are any further objections, would you be able to
> > take this through the tracing tree?
> >
> > Many thanks!
>
> This isn't my file. Trace events usually belong to the subsystems that
> use them. As this adds an event to kernel/sys.c which doesn't really have
> an owner, then I would ask Andrew Morton to take it.

Got it.

Andrew, can you pick this up?

Thanks,
-- Marco
diff --git a/include/trace/events/task.h b/include/trace/events/task.h
index 47b527464d1a..209d315852fb 100644
--- a/include/trace/events/task.h
+++ b/include/trace/events/task.h
@@ -56,6 +56,43 @@ TRACE_EVENT(task_rename,
 		__entry->newcomm, __entry->oom_score_adj)
 );
 
+/**
+ * task_prctl_unknown - called on unknown prctl() option
+ * @option:	option passed
+ * @arg2:	arg2 passed
+ * @arg3:	arg3 passed
+ * @arg4:	arg4 passed
+ * @arg5:	arg5 passed
+ *
+ * Called on an unknown prctl() option.
+ */
+TRACE_EVENT(task_prctl_unknown,
+
+	TP_PROTO(int option, unsigned long arg2, unsigned long arg3,
+		 unsigned long arg4, unsigned long arg5),
+
+	TP_ARGS(option, arg2, arg3, arg4, arg5),
+
+	TP_STRUCT__entry(
+		__field(	int,		option)
+		__field(	unsigned long,	arg2)
+		__field(	unsigned long,	arg3)
+		__field(	unsigned long,	arg4)
+		__field(	unsigned long,	arg5)
+	),
+
+	TP_fast_assign(
+		__entry->option = option;
+		__entry->arg2 = arg2;
+		__entry->arg3 = arg3;
+		__entry->arg4 = arg4;
+		__entry->arg5 = arg5;
+	),
+
+	TP_printk("option=%d arg2=%ld arg3=%ld arg4=%ld arg5=%ld",
+		  __entry->option, __entry->arg2, __entry->arg3, __entry->arg4, __entry->arg5)
+);
+
 #endif
 
 /* This part must be outside protection */
diff --git a/kernel/sys.c b/kernel/sys.c
index 4da31f28fda8..b366cef102ec 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -75,6 +75,8 @@
 #include <asm/io.h>
 #include <asm/unistd.h>
 
+#include <trace/events/task.h>
+
 #include "uid16.h"
 
 #ifndef SET_UNALIGN_CTL
@@ -2785,6 +2787,7 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 		error = RISCV_SET_ICACHE_FLUSH_CTX(arg2, arg3);
 		break;
 	default:
+		trace_task_prctl_unknown(option, arg2, arg3, arg4, arg5);
 		error = -EINVAL;
 		break;
 	}
prctl() is a complex syscall which multiplexes its functionality based
on a large set of PR_* options. Currently we count 64 such options. The
return value of unknown options is -EINVAL, and doesn't distinguish from
known options that were passed invalid args that also return -EINVAL.

To understand if programs are attempting to use prctl() options not yet
available on the running kernel, provide the task_prctl_unknown
tracepoint.

Note, this tracepoint is in an unlikely cold path, and would therefore
be suitable for continuous monitoring (e.g. via perf_event_open).

While the above is likely the simplest usecase, additionally this
tracepoint can help unlock some testing scenarios (where probing
sys_enter or sys_exit causes undesirable performance overheads):

  a. unprivileged triggering of a test module: test modules may register a
     probe to be called back on task_prctl_unknown, and pick a very large
     unknown prctl() option upon which they perform a test function for an
     unprivileged user;

  b. unprivileged triggering of an eBPF program function: similar
     as idea (a).

Example trace_pipe output:

  test-380 [001] ..... 78.142904: task_prctl_unknown: option=1234 arg2=101 arg3=102 arg4=103 arg5=104

Signed-off-by: Marco Elver <elver@google.com>
---
v3:
* Remove "comm".

v2:
* Remove "pid" in trace output (suggested by Steven).
---
 include/trace/events/task.h | 37 +++++++++++++++++++++++++++++++++++++
 kernel/sys.c                |  3 +++
 2 files changed, 40 insertions(+)
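
For illustration only, here is a minimal userspace sketch of how a trace line like the example in the commit message could be produced. It assumes option 1234 is not a valid PR_* option on the running kernel, and the helper name trigger_unknown_prctl() is hypothetical, not something this patch adds.

```c
#include <errno.h>
#include <sys/prctl.h>

/*
 * Hypothetical helper (not part of the patch): call prctl() with an
 * option number that is assumed to be unknown to the running kernel.
 * The argument values mirror the example trace_pipe output.
 *
 * Returns 0 if the kernel rejected the call with EINVAL, which for an
 * unknown option is the default: branch where the patch places
 * trace_task_prctl_unknown(). Returns -1 for any other outcome.
 */
static int trigger_unknown_prctl(int option)
{
	errno = 0;
	if (prctl(option, 101UL, 102UL, 103UL, 104UL) == -1 && errno == EINVAL)
		return 0;
	return -1;
}
```

Calling trigger_unknown_prctl(1234) from an unprivileged test program would be expected to take the default: branch in kernel/sys.c. Note that the EINVAL return alone cannot tell an unknown option apart from a known option with invalid arguments; only the tracepoint makes that distinction visible.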