Message ID | 20241003151638.1608537-2-mathieu.desnoyers@efficios.com (mailing list archive) |
---|---|
State | Superseded |
Series | tracing: Allow system call tracepoints to handle page faults |
On Thu, 3 Oct 2024 11:16:31 -0400
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

> @@ -283,8 +290,13 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
>  				"RCU not watching for tracepoint");	\
>  		}							\
>  	}								\
> -	__DECLARE_TRACE_RCU(name, PARAMS(proto), PARAMS(args),		\
> -			    PARAMS(cond))				\
> +	static inline void trace_##name##_rcuidle(proto)		\
> +	{								\
> +		if (static_key_false(&__tracepoint_##name.key))		\
> +			__DO_TRACE(name,				\
> +				TP_ARGS(args),				\
> +				TP_CONDITION(cond), 1);			\
> +	}								\
>  	static inline int						\
>  	register_trace_##name(void (*probe)(data_proto), void *data)	\
>  	{								\

Looking at this part of your change, I realized it's time to nuke the
rcuidle() variant.

Feel free to rebase on top of this patch:

  https://lore.kernel.org/all/20241003173051.6b178bb3@gandalf.local.home/

-- Steve

On 2024-10-03 23:32, Steven Rostedt wrote:
> On Thu, 3 Oct 2024 11:16:31 -0400
> Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
>
>> @@ -283,8 +290,13 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
>>  				"RCU not watching for tracepoint");	\
>>  		}							\
>>  	}								\
>> -	__DECLARE_TRACE_RCU(name, PARAMS(proto), PARAMS(args),		\
>> -			    PARAMS(cond))				\
>> +	static inline void trace_##name##_rcuidle(proto)		\
>> +	{								\
>> +		if (static_key_false(&__tracepoint_##name.key))		\
>> +			__DO_TRACE(name,				\
>> +				TP_ARGS(args),				\
>> +				TP_CONDITION(cond), 1);			\
>> +	}								\
>>  	static inline int						\
>>  	register_trace_##name(void (*probe)(data_proto), void *data)	\
>>  	{								\
>
> Looking at this part of your change, I realized it's time to nuke the
> rcuidle() variant.
>
> Feel free to rebase on top of this patch:
>
>   https://lore.kernel.org/all/20241003173051.6b178bb3@gandalf.local.home/
>

I will. But you realize that you could have done all this SRCU and
rcuidle nuking on top of my own series rather than pull the rug
under my feet and require me to re-do this series again ?

Grumpily,

Mathieu

On Thu, 3 Oct 2024 20:15:25 -0400
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

> > Feel free to rebase on top of this patch:
> >
> >   https://lore.kernel.org/all/20241003173051.6b178bb3@gandalf.local.home/
> >
>
> I will. But you realize that you could have done all this SRCU and
> rcuidle nuking on top of my own series rather than pull the rug
> under my feet and require me to re-do this series again ?

I thought I was doing you a favor! It's removing a lot of code and would
make your code simpler. ;-)

-- Steve

On 2024-10-04 03:06, Steven Rostedt wrote:
> On Thu, 3 Oct 2024 20:15:25 -0400
> Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
>
>>> Feel free to rebase on top of this patch:
>>>
>>>   https://lore.kernel.org/all/20241003173051.6b178bb3@gandalf.local.home/
>>>
>>
>> I will. But you realize that you could have done all this SRCU and
>> rcuidle nuking on top of my own series rather than pull the rug
>> under my feet and require me to re-do this series again ?
>
> I thought I was doing you a favor! It's removing a lot of code and would
> make your code simpler. ;-)

The rebase was indeed not so bad.

Thanks,

Mathieu

Hi Mathieu,

kernel test robot noticed the following build errors:

[auto build test ERROR on peterz-queue/sched/core]
[also build test ERROR on linus/master tip/core/entry v6.12-rc1 next-20241004]
[cannot apply to rostedt-trace/for-next rostedt-trace/for-next-urgent]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Mathieu-Desnoyers/tracing-Declare-system-call-tracepoints-with-TRACE_EVENT_SYSCALL/20241003-232114
base:   https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/core
patch link:    https://lore.kernel.org/r/20241003151638.1608537-2-mathieu.desnoyers%40efficios.com
patch subject: [PATCH v1 1/8] tracing: Declare system call tracepoints with TRACE_EVENT_SYSCALL
config: powerpc-randconfig-r071-20241004 (https://download.01.org/0day-ci/archive/20241004/202410041838.pOZuOGTX-lkp@intel.com/config)
compiler: powerpc-linux-gcc (GCC) 14.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241004/202410041838.pOZuOGTX-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202410041838.pOZuOGTX-lkp@intel.com/

All errors (new ones prefixed by >>):

   arch/powerpc/kernel/ptrace/ptrace.c: In function 'do_syscall_trace_enter':
>> arch/powerpc/kernel/ptrace/ptrace.c:298:17: error: implicit declaration of function 'trace_sys_enter'; did you mean 'ftrace_nmi_enter'? [-Wimplicit-function-declaration]
     298 |                 trace_sys_enter(regs, regs->gpr[0]);
         |                 ^~~~~~~~~~~~~~~
         |                 ftrace_nmi_enter
   arch/powerpc/kernel/ptrace/ptrace.c: In function 'do_syscall_trace_leave':
>> arch/powerpc/kernel/ptrace/ptrace.c:329:17: error: implicit declaration of function 'trace_sys_exit'; did you mean 'ftrace_nmi_exit'? [-Wimplicit-function-declaration]
     329 |                 trace_sys_exit(regs, regs->result);
         |                 ^~~~~~~~~~~~~~
         |                 ftrace_nmi_exit

Kconfig warnings: (for reference only)
   WARNING: unmet direct dependencies detected for GET_FREE_REGION
   Depends on [n]: SPARSEMEM [=n]
   Selected by [y]:
   - RESOURCE_KUNIT_TEST [=y] && RUNTIME_TESTING_MENU [=y] && KUNIT [=y]

vim +298 arch/powerpc/kernel/ptrace/ptrace.c

2449acc5348b94 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  235
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  236  /**
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  237   * do_syscall_trace_enter() - Do syscall tracing on kernel entry.
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  238   * @regs: the pt_regs of the task to trace (current)
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  239   *
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  240   * Performs various types of tracing on syscall entry. This includes seccomp,
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  241   * ptrace, syscall tracepoints and audit.
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  242   *
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  243   * The pt_regs are potentially visible to userspace via ptrace, so their
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  244   * contents is ABI.
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  245   *
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  246   * One or more of the tracers may modify the contents of pt_regs, in particular
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  247   * to modify arguments or even the syscall number itself.
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  248   *
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  249   * It's also possible that a tracer can choose to reject the system call. In
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  250   * that case this function will return an illegal syscall number, and will put
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  251   * an appropriate return value in regs->r3.
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  252   *
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  253   * Return: the (possibly changed) syscall number.
^1da177e4c3f41 arch/ppc/kernel/ptrace.c Linus Torvalds 2005-04-16  254   */
4f72c4279eab1e arch/powerpc/kernel/ptrace.c Roland McGrath 2008-07-27  255  long do_syscall_trace_enter(struct pt_regs *regs)
ea9c102cb0a796 arch/ppc/kernel/ptrace.c David Woodhouse 2005-05-08  256  {
8dbdec0bcb416d arch/powerpc/kernel/ptrace.c Dmitry V. Levin 2018-12-16  257  	u32 flags;
8dbdec0bcb416d arch/powerpc/kernel/ptrace.c Dmitry V. Levin 2018-12-16  258
985faa78687de6 arch/powerpc/kernel/ptrace/ptrace.c Mark Rutland 2021-11-29  259  	flags = read_thread_flags() & (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE);
8dbdec0bcb416d arch/powerpc/kernel/ptrace.c Dmitry V. Levin 2018-12-16  260
8dbdec0bcb416d arch/powerpc/kernel/ptrace.c Dmitry V. Levin 2018-12-16  261  	if (flags) {
153474ba1a4aed arch/powerpc/kernel/ptrace/ptrace.c Eric W. Biederman 2022-01-27  262  		int rc = ptrace_report_syscall_entry(regs);
8dbdec0bcb416d arch/powerpc/kernel/ptrace.c Dmitry V. Levin 2018-12-16  263
8dbdec0bcb416d arch/powerpc/kernel/ptrace.c Dmitry V. Levin 2018-12-16  264  		if (unlikely(flags & _TIF_SYSCALL_EMU)) {
5521eb4bca2db7 arch/powerpc/kernel/ptrace.c Breno Leitao 2018-09-20  265  			/*
8dbdec0bcb416d arch/powerpc/kernel/ptrace.c Dmitry V. Levin 2018-12-16  266  			 * A nonzero return code from
153474ba1a4aed arch/powerpc/kernel/ptrace/ptrace.c Eric W. Biederman 2022-01-27  267  			 * ptrace_report_syscall_entry() tells us to prevent
8dbdec0bcb416d arch/powerpc/kernel/ptrace.c Dmitry V. Levin 2018-12-16  268  			 * the syscall execution, but we are not going to
8dbdec0bcb416d arch/powerpc/kernel/ptrace.c Dmitry V. Levin 2018-12-16  269  			 * execute it anyway.
a225f156740555 arch/powerpc/kernel/ptrace.c Elvira Khabirova 2018-12-07  270  			 *
8dbdec0bcb416d arch/powerpc/kernel/ptrace.c Dmitry V. Levin 2018-12-16  271  			 * Returning -1 will skip the syscall execution. We want
8dbdec0bcb416d arch/powerpc/kernel/ptrace.c Dmitry V. Levin 2018-12-16  272  			 * to avoid clobbering any registers, so we don't goto
8dbdec0bcb416d arch/powerpc/kernel/ptrace.c Dmitry V. Levin 2018-12-16  273  			 * the skip label below.
5521eb4bca2db7 arch/powerpc/kernel/ptrace.c Breno Leitao 2018-09-20  274  			 */
5521eb4bca2db7 arch/powerpc/kernel/ptrace.c Breno Leitao 2018-09-20  275  			return -1;
5521eb4bca2db7 arch/powerpc/kernel/ptrace.c Breno Leitao 2018-09-20  276  		}
5521eb4bca2db7 arch/powerpc/kernel/ptrace.c Breno Leitao 2018-09-20  277
8dbdec0bcb416d arch/powerpc/kernel/ptrace.c Dmitry V. Levin 2018-12-16  278  		if (rc) {
4f72c4279eab1e arch/powerpc/kernel/ptrace.c Roland McGrath 2008-07-27  279  			/*
8dbdec0bcb416d arch/powerpc/kernel/ptrace.c Dmitry V. Levin 2018-12-16  280  			 * The tracer decided to abort the syscall. Note that
8dbdec0bcb416d arch/powerpc/kernel/ptrace.c Dmitry V. Levin 2018-12-16  281  			 * the tracer may also just change regs->gpr[0] to an
8dbdec0bcb416d arch/powerpc/kernel/ptrace.c Dmitry V. Levin 2018-12-16  282  			 * invalid syscall number, that is handled below on the
8dbdec0bcb416d arch/powerpc/kernel/ptrace.c Dmitry V. Levin 2018-12-16  283  			 * exit path.
4f72c4279eab1e arch/powerpc/kernel/ptrace.c Roland McGrath 2008-07-27  284  			 */
1addc57e111b92 arch/powerpc/kernel/ptrace.c Kees Cook 2016-06-02  285  			goto skip;
8dbdec0bcb416d arch/powerpc/kernel/ptrace.c Dmitry V. Levin 2018-12-16  286  		}
8dbdec0bcb416d arch/powerpc/kernel/ptrace.c Dmitry V. Levin 2018-12-16  287  	}
1addc57e111b92 arch/powerpc/kernel/ptrace.c Kees Cook 2016-06-02  288
1addc57e111b92 arch/powerpc/kernel/ptrace.c Kees Cook 2016-06-02  289  	/* Run seccomp after ptrace; allow it to set gpr[3]. */
1addc57e111b92 arch/powerpc/kernel/ptrace.c Kees Cook 2016-06-02  290  	if (do_seccomp(regs))
1addc57e111b92 arch/powerpc/kernel/ptrace.c Kees Cook 2016-06-02  291  		return -1;
1addc57e111b92 arch/powerpc/kernel/ptrace.c Kees Cook 2016-06-02  292
1addc57e111b92 arch/powerpc/kernel/ptrace.c Kees Cook 2016-06-02  293  	/* Avoid trace and audit when syscall is invalid. */
1addc57e111b92 arch/powerpc/kernel/ptrace.c Kees Cook 2016-06-02  294  	if (regs->gpr[0] >= NR_syscalls)
1addc57e111b92 arch/powerpc/kernel/ptrace.c Kees Cook 2016-06-02  295  		goto skip;
ea9c102cb0a796 arch/ppc/kernel/ptrace.c David Woodhouse 2005-05-08  296
02424d8966d803 arch/powerpc/kernel/ptrace.c Ian Munsie 2011-02-02  297  	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
02424d8966d803 arch/powerpc/kernel/ptrace.c Ian Munsie 2011-02-02 @298  		trace_sys_enter(regs, regs->gpr[0]);
02424d8966d803 arch/powerpc/kernel/ptrace.c Ian Munsie 2011-02-02  299
cab175f9fa2973 arch/powerpc/kernel/ptrace.c Denis Kirjanov 2010-08-27  300  	if (!is_32bit_task())
91397401bb5072 arch/powerpc/kernel/ptrace.c Eric Paris 2014-03-11  301  		audit_syscall_entry(regs->gpr[0], regs->gpr[3], regs->gpr[4],
ea9c102cb0a796 arch/ppc/kernel/ptrace.c David Woodhouse 2005-05-08  302  				    regs->gpr[5], regs->gpr[6]);
cfcd1705b61ecc arch/powerpc/kernel/ptrace.c David Woodhouse 2007-01-14  303  	else
91397401bb5072 arch/powerpc/kernel/ptrace.c Eric Paris 2014-03-11  304  		audit_syscall_entry(regs->gpr[0],
cfcd1705b61ecc arch/powerpc/kernel/ptrace.c David Woodhouse 2007-01-14  305  				    regs->gpr[3] & 0xffffffff,
cfcd1705b61ecc arch/powerpc/kernel/ptrace.c David Woodhouse 2007-01-14  306  				    regs->gpr[4] & 0xffffffff,
cfcd1705b61ecc arch/powerpc/kernel/ptrace.c David Woodhouse 2007-01-14  307  				    regs->gpr[5] & 0xffffffff,
cfcd1705b61ecc arch/powerpc/kernel/ptrace.c David Woodhouse 2007-01-14  308  				    regs->gpr[6] & 0xffffffff);
4f72c4279eab1e arch/powerpc/kernel/ptrace.c Roland McGrath 2008-07-27  309
1addc57e111b92 arch/powerpc/kernel/ptrace.c Kees Cook 2016-06-02  310  	/* Return the possibly modified but valid syscall number */
1addc57e111b92 arch/powerpc/kernel/ptrace.c Kees Cook 2016-06-02  311  	return regs->gpr[0];
1addc57e111b92 arch/powerpc/kernel/ptrace.c Kees Cook 2016-06-02  312
1addc57e111b92 arch/powerpc/kernel/ptrace.c Kees Cook 2016-06-02  313  skip:
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  314  	/*
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  315  	 * If we are aborting explicitly, or if the syscall number is
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  316  	 * now invalid, set the return value to -ENOSYS.
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  317  	 */
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  318  	regs->gpr[3] = -ENOSYS;
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  319  	return -1;
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  320  }
d38374142b2560 arch/powerpc/kernel/ptrace.c Michael Ellerman 2015-07-23  321
ea9c102cb0a796 arch/ppc/kernel/ptrace.c David Woodhouse 2005-05-08  322  void do_syscall_trace_leave(struct pt_regs *regs)
ea9c102cb0a796 arch/ppc/kernel/ptrace.c David Woodhouse 2005-05-08  323  {
4f72c4279eab1e arch/powerpc/kernel/ptrace.c Roland McGrath 2008-07-27  324  	int step;
4f72c4279eab1e arch/powerpc/kernel/ptrace.c Roland McGrath 2008-07-27  325
d7e7528bcd456f arch/powerpc/kernel/ptrace.c Eric Paris 2012-01-03  326  	audit_syscall_exit(regs);
ea9c102cb0a796 arch/ppc/kernel/ptrace.c David Woodhouse 2005-05-08  327
02424d8966d803 arch/powerpc/kernel/ptrace.c Ian Munsie 2011-02-02  328  	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
02424d8966d803 arch/powerpc/kernel/ptrace.c Ian Munsie 2011-02-02 @329  		trace_sys_exit(regs, regs->result);
02424d8966d803 arch/powerpc/kernel/ptrace.c Ian Munsie 2011-02-02  330
4f72c4279eab1e arch/powerpc/kernel/ptrace.c Roland McGrath 2008-07-27  331  	step = test_thread_flag(TIF_SINGLESTEP);
4f72c4279eab1e arch/powerpc/kernel/ptrace.c Roland McGrath 2008-07-27  332  	if (step || test_thread_flag(TIF_SYSCALL_TRACE))
153474ba1a4aed arch/powerpc/kernel/ptrace/ptrace.c Eric W. Biederman 2022-01-27  333  		ptrace_report_syscall_exit(regs, step);
ea9c102cb0a796 arch/ppc/kernel/ptrace.c David Woodhouse 2005-05-08  334  }
002af9391bfbe8 arch/powerpc/kernel/ptrace.c Michael Ellerman 2018-10-12  335

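One plausible reading of the two errors above, which the robot itself does not state: once sys_enter/sys_exit are declared with TRACE_EVENT_SYSCALL, the header emits only trace_syscall_sys_enter()/trace_syscall_sys_exit(), while this patch updates the callers in kernel/entry/common.c and kernel/trace/trace_syscalls.c only. Arch code with its own syscall trace path, such as the powerpc ptrace.c shown here, would then still call the old names. The userspace sketch below mimics that naming effect. MOCK_DECLARE_TRACE_SYSCALL, mock_emit() and mock_arch_syscall_trace_enter() are hypothetical names for illustration, and `long regs` stands in for `struct pt_regs *`.

```c
#include <assert.h>

#define PARAMS(...) __VA_ARGS__

/* Illustrative bookkeeping so the sketch has an observable effect. */
static int mock_hits;
static long mock_last_id = -1;

static void mock_emit(long regs, long id)
{
	(void)regs;
	mock_last_id = id;
	mock_hits++;
}

/*
 * Hypothetical stand-in for __DECLARE_TRACE_SYSCALL: it emits
 * trace_syscall_<name>() and nothing named trace_<name>().
 */
#define MOCK_DECLARE_TRACE_SYSCALL(name, proto, args)		\
	static inline void trace_syscall_##name(proto)		\
	{							\
		mock_emit(args);				\
	}

MOCK_DECLARE_TRACE_SYSCALL(sys_enter,
			   PARAMS(long regs, long id),
			   PARAMS(regs, id))

/* Mock of an arch-specific entry path that does its own tracing. */
static long mock_arch_syscall_trace_enter(long regs, long id)
{
	/*
	 * Calling trace_sys_enter(regs, id) here would not compile:
	 * the macro above never declares a function with that name.
	 */
	trace_syscall_sys_enter(regs, id);
	assert(mock_hits > 0);
	return id;
}
```

Under that assumption, arch callers would need the same rename that the patch applies to kernel/entry/common.c.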
diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index 93a9f3070b48..666499b9f3be 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -268,10 +268,17 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
  * site if it is not watching, as it will need to be active when the
  * tracepoint is enabled.
  */
-#define __DECLARE_TRACE(name, proto, args, cond, data_proto)		\
+#define __DECLARE_TRACE_COMMON(name, proto, args, cond, data_proto)	\
 	extern int __traceiter_##name(data_proto);			\
 	DECLARE_STATIC_CALL(tp_func_##name, __traceiter_##name);	\
 	extern struct tracepoint __tracepoint_##name;			\
+	static inline void						\
+	check_trace_callback_type_##name(void (*cb)(data_proto))	\
+	{								\
+	}								\
+
+#define __DECLARE_TRACE(name, proto, args, cond, data_proto)		\
+	__DECLARE_TRACE_COMMON(name, PARAMS(proto), PARAMS(args), cond, PARAMS(data_proto)) \
 	static inline void trace_##name(proto)				\
 	{								\
 		if (static_key_false(&__tracepoint_##name.key))		\
@@ -283,8 +290,13 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 				"RCU not watching for tracepoint");	\
 		}							\
 	}								\
-	__DECLARE_TRACE_RCU(name, PARAMS(proto), PARAMS(args),		\
-			    PARAMS(cond))				\
+	static inline void trace_##name##_rcuidle(proto)		\
+	{								\
+		if (static_key_false(&__tracepoint_##name.key))		\
+			__DO_TRACE(name,				\
+				TP_ARGS(args),				\
+				TP_CONDITION(cond), 1);			\
+	}								\
 	static inline int						\
 	register_trace_##name(void (*probe)(data_proto), void *data)	\
 	{								\
@@ -302,14 +314,42 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 	unregister_trace_##name(void (*probe)(data_proto), void *data)	\
 	{								\
 		return tracepoint_probe_unregister(&__tracepoint_##name,\
-						(void *)probe, data);	\
+						(void *)probe, data);\
 	}								\
-	static inline void						\
-	check_trace_callback_type_##name(void (*cb)(data_proto))	\
+	static inline bool						\
+	trace_##name##_enabled(void)					\
+	{								\
+		return static_key_false(&__tracepoint_##name.key);	\
+	}
+
+
+#define __DECLARE_TRACE_SYSCALL(name, proto, args, cond, data_proto)	\
+	__DECLARE_TRACE_COMMON(name, PARAMS(proto), PARAMS(args), cond, PARAMS(data_proto)) \
+	static inline void trace_syscall_##name(proto)			\
+	{								\
+		if (static_key_false(&__tracepoint_##name.key))		\
+			__DO_TRACE(name,				\
+				TP_ARGS(args),				\
+				TP_CONDITION(cond), 0);			\
+		if (IS_ENABLED(CONFIG_LOCKDEP) && (cond)) {		\
+			WARN_ONCE(!rcu_is_watching(),			\
+				"RCU not watching for tracepoint");	\
+		}							\
+	}								\
+	static inline int						\
+	register_trace_syscall_##name(void (*probe)(data_proto), void *data) \
 	{								\
+		return tracepoint_probe_register(&__tracepoint_##name,	\
+						(void *)probe, data);	\
+	}								\
+	static inline int						\
+	unregister_trace_syscall_##name(void (*probe)(data_proto), void *data) \
+	{								\
+		return tracepoint_probe_unregister(&__tracepoint_##name,\
+						(void *)probe, data);\
 	}								\
 	static inline bool						\
-	trace_##name##_enabled(void)					\
+	trace_syscall_##name##_enabled(void)				\
 	{								\
 		return static_key_false(&__tracepoint_##name.key);	\
 	}
@@ -398,6 +438,27 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 		return false;						\
 	}
 
+#define __DECLARE_TRACE_SYSCALL(name, proto, args, cond, data_proto)	\
+	static inline void trace_syscall_##name(proto)			\
+	{ }								\
+	static inline int						\
+	register_trace_syscall_##name(void (*probe)(data_proto),	\
+				      void *data)			\
+	{								\
+		return -ENOSYS;						\
+	}								\
+	static inline int						\
+	unregister_trace_syscall_##name(void (*probe)(data_proto),	\
+					void *data)			\
+	{								\
+		return -ENOSYS;						\
+	}								\
+	static inline bool						\
+	trace_syscall_##name##_enabled(void)				\
+	{								\
+		return false;						\
+	}
+
 #define DEFINE_TRACE_FN(name, reg, unreg, proto, args)
 #define DEFINE_TRACE(name, proto, args)
 #define EXPORT_TRACEPOINT_SYMBOL_GPL(name)
@@ -459,6 +520,11 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 			cpu_online(raw_smp_processor_id()) && (PARAMS(cond)), \
 		PARAMS(void *__data, proto))
 
+#define DECLARE_TRACE_SYSCALL(name, proto, args)			\
+	__DECLARE_TRACE_SYSCALL(name, PARAMS(proto), PARAMS(args),	\
+				cpu_online(raw_smp_processor_id()),	\
+				PARAMS(void *__data, proto))
+
 #define TRACE_EVENT_FLAGS(event, flag)
 
 #define TRACE_EVENT_PERF_PERM(event, expr...)
@@ -596,6 +662,9 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 			      struct, assign, print)			\
 	DECLARE_TRACE_CONDITION(name, PARAMS(proto),			\
 				PARAMS(args), PARAMS(cond))
+#define TRACE_EVENT_SYSCALL(name, proto, args, struct, assign,	\
+			    print, reg, unreg)				\
+	DECLARE_TRACE_SYSCALL(name, PARAMS(proto), PARAMS(args))
 
 #define TRACE_EVENT_FLAGS(event, flag)
 
diff --git a/include/trace/bpf_probe.h b/include/trace/bpf_probe.h
index a2ea11cc912e..c85bbce5aaa5 100644
--- a/include/trace/bpf_probe.h
+++ b/include/trace/bpf_probe.h
@@ -53,6 +53,9 @@ __bpf_trace_##call(void *__data, proto)					\
 #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print)	\
 	__BPF_DECLARE_TRACE(call, PARAMS(proto), PARAMS(args))
 
+#undef DECLARE_EVENT_SYSCALL_CLASS
+#define DECLARE_EVENT_SYSCALL_CLASS DECLARE_EVENT_CLASS
+
 /*
  * This part is compiled out, it is only here as a build time check
  * to make sure that if the tracepoint handling changes, the
diff --git a/include/trace/define_trace.h b/include/trace/define_trace.h
index 00723935dcc7..ff5fa17a6259 100644
--- a/include/trace/define_trace.h
+++ b/include/trace/define_trace.h
@@ -46,6 +46,10 @@
 		assign, print, reg, unreg)			\
 	DEFINE_TRACE_FN(name, reg, unreg, PARAMS(proto), PARAMS(args))
 
+#undef TRACE_EVENT_SYSCALL
+#define TRACE_EVENT_SYSCALL(name, proto, args, struct, assign, print, reg, unreg) \
+	DEFINE_TRACE_FN(name, reg, unreg, PARAMS(proto), PARAMS(args))
+
 #undef TRACE_EVENT_NOP
 #define TRACE_EVENT_NOP(name, proto, args, struct, assign, print)
 
@@ -107,6 +111,7 @@
 #undef TRACE_EVENT
 #undef TRACE_EVENT_FN
 #undef TRACE_EVENT_FN_COND
+#undef TRACE_EVENT_SYSCALL
 #undef TRACE_EVENT_CONDITION
 #undef TRACE_EVENT_NOP
 #undef DEFINE_EVENT_NOP
diff --git a/include/trace/events/syscalls.h b/include/trace/events/syscalls.h
index b6e0cbc2c71f..f31ff446b468 100644
--- a/include/trace/events/syscalls.h
+++ b/include/trace/events/syscalls.h
@@ -15,7 +15,7 @@
 
 #ifdef CONFIG_HAVE_SYSCALL_TRACEPOINTS
 
-TRACE_EVENT_FN(sys_enter,
+TRACE_EVENT_SYSCALL(sys_enter,
 
 	TP_PROTO(struct pt_regs *regs, long id),
 
@@ -41,7 +41,7 @@ TRACE_EVENT_FN(sys_enter,
 
 TRACE_EVENT_FLAGS(sys_enter, TRACE_EVENT_FL_CAP_ANY)
 
-TRACE_EVENT_FN(sys_exit,
+TRACE_EVENT_SYSCALL(sys_exit,
 
 	TP_PROTO(struct pt_regs *regs, long ret),
 
diff --git a/include/trace/perf.h b/include/trace/perf.h
index 2c11181c82e0..ded997af481e 100644
--- a/include/trace/perf.h
+++ b/include/trace/perf.h
@@ -55,6 +55,9 @@ perf_trace_##call(void *__data, proto)					\
 				  head, __task);			\
 }
 
+#undef DECLARE_EVENT_SYSCALL_CLASS
+#define DECLARE_EVENT_SYSCALL_CLASS DECLARE_EVENT_CLASS
+
 /*
  * This part is compiled out, it is only here as a build time check
  * to make sure that if the tracepoint handling changes, the
diff --git a/include/trace/trace_events.h b/include/trace/trace_events.h
index c2f9cabf154d..8bcbb9ee44de 100644
--- a/include/trace/trace_events.h
+++ b/include/trace/trace_events.h
@@ -45,6 +45,16 @@
 			     PARAMS(print));		       \
 	DEFINE_EVENT(name, name, PARAMS(proto), PARAMS(args));
 
+#undef TRACE_EVENT_SYSCALL
+#define TRACE_EVENT_SYSCALL(name, proto, args, tstruct, assign, print, reg, unreg) \
+	DECLARE_EVENT_SYSCALL_CLASS(name,			       \
+			     PARAMS(proto),		       \
+			     PARAMS(args),		       \
+			     PARAMS(tstruct),		       \
+			     PARAMS(assign),		       \
+			     PARAMS(print));		       \
+	DEFINE_EVENT(name, name, PARAMS(proto), PARAMS(args));
+
 #include "stages/stage1_struct_define.h"
 
 #undef DECLARE_EVENT_CLASS
@@ -57,6 +67,9 @@
 									\
 	static struct trace_event_class event_class_##name;
 
+#undef DECLARE_EVENT_SYSCALL_CLASS
+#define DECLARE_EVENT_SYSCALL_CLASS DECLARE_EVENT_CLASS
+
 #undef DEFINE_EVENT
 #define DEFINE_EVENT(template, name, proto, args)	\
 	static struct trace_event_call	__used		\
@@ -117,6 +130,9 @@
 		tstruct;						\
 	};
 
+#undef DECLARE_EVENT_SYSCALL_CLASS
+#define DECLARE_EVENT_SYSCALL_CLASS DECLARE_EVENT_CLASS
+
 #undef DEFINE_EVENT
 #define DEFINE_EVENT(template, name, proto, args)
 
@@ -208,6 +224,9 @@ static struct trace_event_functions trace_event_type_funcs_##call = {	\
 	.trace			= trace_raw_output_##call,		\
 };
 
+#undef DECLARE_EVENT_SYSCALL_CLASS
+#define DECLARE_EVENT_SYSCALL_CLASS DECLARE_EVENT_CLASS
+
 #undef DEFINE_EVENT_PRINT
 #define DEFINE_EVENT_PRINT(template, call, proto, args, print)		\
 static notrace enum print_line_t					\
@@ -265,6 +284,9 @@ static inline notrace int trace_event_get_offsets_##call(		\
 	return __data_size;						\
 }
 
+#undef DECLARE_EVENT_SYSCALL_CLASS
+#define DECLARE_EVENT_SYSCALL_CLASS DECLARE_EVENT_CLASS
+
 #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
 
 /*
@@ -409,6 +431,9 @@ trace_event_raw_event_##call(void *__data, proto)			\
  * fail to compile unless it too is updated.
  */
 
+#undef DECLARE_EVENT_SYSCALL_CLASS
+#define DECLARE_EVENT_SYSCALL_CLASS DECLARE_EVENT_CLASS
+
 #undef DEFINE_EVENT
 #define DEFINE_EVENT(template, call, proto, args)			\
 static inline void ftrace_test_probe_##call(void)			\
@@ -434,6 +459,9 @@ static struct trace_event_class __used __refdata event_class_##call = { \
 	_TRACE_PERF_INIT(call)						\
 };
 
+#undef DECLARE_EVENT_SYSCALL_CLASS
+#define DECLARE_EVENT_SYSCALL_CLASS DECLARE_EVENT_CLASS
+
 #undef DEFINE_EVENT
 #define DEFINE_EVENT(template, call, proto, args)			\
 									\
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 5b6934e23c21..c9ac1c605d8b 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -58,7 +58,7 @@ long syscall_trace_enter(struct pt_regs *regs, long syscall,
 	syscall = syscall_get_nr(current, regs);
 
 	if (unlikely(work & SYSCALL_WORK_SYSCALL_TRACEPOINT)) {
-		trace_sys_enter(regs, syscall);
+		trace_syscall_sys_enter(regs, syscall);
 		/*
 		 * Probes or BPF hooks in the tracepoint may have changed the
 		 * system call number as well.
@@ -166,7 +166,7 @@ static void syscall_exit_work(struct pt_regs *regs, unsigned long work)
 	audit_syscall_exit(regs);
 
 	if (work & SYSCALL_WORK_SYSCALL_TRACEPOINT)
-		trace_sys_exit(regs, syscall_get_return_value(current, regs));
+		trace_syscall_sys_exit(regs, syscall_get_return_value(current, regs));
 
 	step = report_single_step(work);
 	if (step || work & SYSCALL_WORK_SYSCALL_TRACE)
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index 785733245ead..67ac5366f724 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -377,7 +377,7 @@ static int reg_event_syscall_enter(struct trace_event_file *file,
 		return -ENOSYS;
 	mutex_lock(&syscall_trace_lock);
 	if (!tr->sys_refcount_enter)
-		ret = register_trace_sys_enter(ftrace_syscall_enter, tr);
+		ret = register_trace_syscall_sys_enter(ftrace_syscall_enter, tr);
 	if (!ret) {
 		rcu_assign_pointer(tr->enter_syscall_files[num], file);
 		tr->sys_refcount_enter++;
@@ -399,7 +399,7 @@ static void unreg_event_syscall_enter(struct trace_event_file *file,
 	tr->sys_refcount_enter--;
 	RCU_INIT_POINTER(tr->enter_syscall_files[num], NULL);
 	if (!tr->sys_refcount_enter)
-		unregister_trace_sys_enter(ftrace_syscall_enter, tr);
+		unregister_trace_syscall_sys_enter(ftrace_syscall_enter, tr);
 	mutex_unlock(&syscall_trace_lock);
 }
 
@@ -415,7 +415,7 @@ static int reg_event_syscall_exit(struct trace_event_file *file,
 		return -ENOSYS;
 	mutex_lock(&syscall_trace_lock);
 	if (!tr->sys_refcount_exit)
-		ret = register_trace_sys_exit(ftrace_syscall_exit, tr);
+		ret = register_trace_syscall_sys_exit(ftrace_syscall_exit, tr);
 	if (!ret) {
 		rcu_assign_pointer(tr->exit_syscall_files[num], file);
 		tr->sys_refcount_exit++;
@@ -437,7 +437,7 @@ static void unreg_event_syscall_exit(struct trace_event_file *file,
 	tr->sys_refcount_exit--;
 	RCU_INIT_POINTER(tr->exit_syscall_files[num], NULL);
 	if (!tr->sys_refcount_exit)
-		unregister_trace_sys_exit(ftrace_syscall_exit, tr);
+		unregister_trace_syscall_sys_exit(ftrace_syscall_exit, tr);
 	mutex_unlock(&syscall_trace_lock);
 }
 
@@ -633,7 +633,7 @@ static int perf_sysenter_enable(struct trace_event_call *call)
 
 	mutex_lock(&syscall_trace_lock);
 	if (!sys_perf_refcount_enter)
-		ret = register_trace_sys_enter(perf_syscall_enter, NULL);
+		ret = register_trace_syscall_sys_enter(perf_syscall_enter, NULL);
 	if (ret) {
 		pr_info("event trace: Could not activate syscall entry trace point");
 	} else {
@@ -654,7 +654,7 @@ static void perf_sysenter_disable(struct trace_event_call *call)
 	sys_perf_refcount_enter--;
 	clear_bit(num, enabled_perf_enter_syscalls);
 	if (!sys_perf_refcount_enter)
-		unregister_trace_sys_enter(perf_syscall_enter, NULL);
+		unregister_trace_syscall_sys_enter(perf_syscall_enter, NULL);
 	mutex_unlock(&syscall_trace_lock);
 }
 
@@ -732,7 +732,7 @@ static int perf_sysexit_enable(struct trace_event_call *call)
 
 	mutex_lock(&syscall_trace_lock);
 	if (!sys_perf_refcount_exit)
-		ret = register_trace_sys_exit(perf_syscall_exit, NULL);
+		ret = register_trace_syscall_sys_exit(perf_syscall_exit, NULL);
 	if (ret) {
 		pr_info("event trace: Could not activate syscall exit trace point");
 	} else {
@@ -753,7 +753,7 @@ static void perf_sysexit_disable(struct trace_event_call *call)
 	sys_perf_refcount_exit--;
 	clear_bit(num, enabled_perf_exit_syscalls);
 	if (!sys_perf_refcount_exit)
-		unregister_trace_sys_exit(perf_syscall_exit, NULL);
+		unregister_trace_syscall_sys_exit(perf_syscall_exit, NULL);
 	mutex_unlock(&syscall_trace_lock);
 }

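The second __DECLARE_TRACE_SYSCALL definition in the tracepoint.h hunk at -398 is a set of stubs: the trace call compiles to nothing, register/unregister return -ENOSYS, and the _enabled() helper returns false. Judging by the v1 changelog, that is the CONFIG_TRACEPOINTS=n branch. A minimal userspace sketch of that stub pattern follows; MOCK_DECLARE_TRACE_SYSCALL, mock_probe() and mock_check_disabled() are hypothetical names, `long regs` stands in for `struct pt_regs *`, and only the disabled variant is shown.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define PARAMS(...) __VA_ARGS__

/*
 * Hypothetical "tracepoints compiled out" variant: callers keep
 * building, but registration reports -ENOSYS and nothing ever fires.
 */
#define MOCK_DECLARE_TRACE_SYSCALL(name, proto, data_proto)		\
	static inline void trace_syscall_##name(proto)			\
	{ }								\
	static inline int						\
	register_trace_syscall_##name(void (*probe)(data_proto),	\
				      void *data)			\
	{								\
		(void)probe;						\
		(void)data;						\
		return -ENOSYS;						\
	}								\
	static inline int						\
	unregister_trace_syscall_##name(void (*probe)(data_proto),	\
					void *data)			\
	{								\
		(void)probe;						\
		(void)data;						\
		return -ENOSYS;						\
	}								\
	static inline bool						\
	trace_syscall_##name##_enabled(void)				\
	{								\
		return false;						\
	}

MOCK_DECLARE_TRACE_SYSCALL(sys_exit,
			   PARAMS(long regs, long ret),
			   PARAMS(void *ctx, long regs, long ret))

/* A probe with the matching signature; it is never invoked here. */
static void mock_probe(void *ctx, long regs, long ret)
{
	(void)ctx;
	(void)regs;
	(void)ret;
}

static int mock_enable_exit_tracing(void)
{
	return register_trace_syscall_sys_exit(mock_probe, NULL);
}

/* Exercises every stub once, as a caller in a tracing-less build might. */
static void mock_check_disabled(void)
{
	trace_syscall_sys_exit(0, 0);
	assert(mock_enable_exit_tracing() == -ENOSYS);
	assert(unregister_trace_syscall_sys_exit(mock_probe, NULL) == -ENOSYS);
	assert(!trace_syscall_sys_exit_enabled());
}
```

Keeping the stub names identical to the enabled variant is what lets call sites such as kernel/trace/trace_syscalls.c build without their own #ifdef.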
In preparation for allowing system call tracepoints to handle page
faults, introduce TRACE_EVENT_SYSCALL to declare the sys_enter/sys_exit
tracepoints.

Emit the static inlines register_trace_syscall_##name for events
declared with TRACE_EVENT_SYSCALL, allowing source-level validation
that only probes meant to handle system call entry/exit events are
registered to them.

Move the common code between __DECLARE_TRACE and __DECLARE_TRACE_SYSCALL
into __DECLARE_TRACE_COMMON.

This change is not meant to alter the generated code, and only prepares
the following modifications.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Michael Jeanson <mjeanson@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: bpf@vger.kernel.org
Cc: Joel Fernandes <joel@joelfernandes.org>
---
Changes since v0:
- Fix allnoconfig build by adding __DECLARE_TRACE_SYSCALL define in
  CONFIG_TRACEPOINTS=n case.
- Rename unregister_trace_sys_{enter,exit} to
  unregister_trace_syscall_sys_{enter,exit} for symmetry with register.
- Emit trace_syscall_##name##_enabled for syscall tracepoints rather
  than trace_##name##_enabled, so it is in sync with the rest of the
  naming.
---
 include/linux/tracepoint.h      | 83 ++++++++++++++++++++++++++++++---
 include/trace/bpf_probe.h       |  3 ++
 include/trace/define_trace.h    |  5 ++
 include/trace/events/syscalls.h |  4 +-
 include/trace/perf.h            |  3 ++
 include/trace/trace_events.h    | 28 +++++++++++
 kernel/entry/common.c           |  4 +-
 kernel/trace/trace_syscalls.c   | 16 +++----
 8 files changed, 127 insertions(+), 19 deletions(-)
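
To make the register_trace_syscall_##name idea from the commit message concrete, here is a minimal userspace sketch of a macro that generates a syscall-specific trace/register/unregister trio by token pasting. It is not the kernel implementation: struct mock_tracepoint, mock_probe_register(), mock_probe_unregister() and MOCK_DECLARE_TRACE_SYSCALL are hypothetical, the mock holds a single probe slot, there is no static key or RCU, and `long regs` stands in for `struct pt_regs *`.

```c
#include <assert.h>
#include <stddef.h>

#define PARAMS(...) __VA_ARGS__

typedef void (*mock_fn)(void);	/* generic function pointer for storage */

struct mock_tracepoint {
	mock_fn probe;		/* single slot; purely a simplification */
	void *data;
};

static int mock_probe_register(struct mock_tracepoint *tp, mock_fn probe,
			       void *data)
{
	if (tp->probe)
		return -1;	/* slot already taken */
	tp->probe = probe;
	tp->data = data;
	return 0;
}

static int mock_probe_unregister(struct mock_tracepoint *tp, mock_fn probe,
				 void *data)
{
	if (tp->probe != probe || tp->data != data)
		return -1;	/* not the registered probe */
	tp->probe = NULL;
	tp->data = NULL;
	return 0;
}

/* Emits trace_syscall_<name>() and (un)register_trace_syscall_<name>(). */
#define MOCK_DECLARE_TRACE_SYSCALL(name, proto, args, data_proto)	\
	static struct mock_tracepoint mock_tracepoint_##name;		\
	static inline void trace_syscall_##name(proto)			\
	{								\
		void (*fn)(data_proto) =				\
			(void (*)(data_proto))mock_tracepoint_##name.probe; \
		if (fn)							\
			fn(mock_tracepoint_##name.data, args);		\
	}								\
	static inline int						\
	register_trace_syscall_##name(void (*probe)(data_proto), void *data) \
	{								\
		return mock_probe_register(&mock_tracepoint_##name,	\
					   (mock_fn)probe, data);	\
	}								\
	static inline int						\
	unregister_trace_syscall_##name(void (*probe)(data_proto), void *data) \
	{								\
		return mock_probe_unregister(&mock_tracepoint_##name,	\
					     (mock_fn)probe, data);	\
	}

MOCK_DECLARE_TRACE_SYSCALL(sys_enter,
			   PARAMS(long regs, long id),
			   PARAMS(regs, id),
			   PARAMS(void *ctx, long regs, long id))

static long mock_last_id = -1;

static void mock_syscall_enter_probe(void *ctx, long regs, long id)
{
	(void)ctx;
	(void)regs;
	mock_last_id = id;
}

/* Register, fire once, unregister, fire again; returns the last id seen. */
static long mock_demo(void)
{
	if (register_trace_syscall_sys_enter(mock_syscall_enter_probe, NULL))
		return -1;
	trace_syscall_sys_enter(0, 59);
	assert(mock_last_id == 59);
	if (unregister_trace_syscall_sys_enter(mock_syscall_enter_probe, NULL))
		return -1;
	trace_syscall_sys_enter(0, 60);	/* no probe attached, nothing recorded */
	return mock_last_id;
}
```

Because the generated names carry the syscall_ prefix, any code that attaches a probe to sys_enter or sys_exit has to spell out register_trace_syscall_sys_enter(), which is what makes such registrations easy to find and audit in the source.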