diff mbox series

[1/4] tracing: add error_report trace points

Message ID 20210113091657.1456216-2-glider@google.com (mailing list archive)
State New, archived
Headers show
Series [1/4] tracing: add error_report trace points | expand

Commit Message

Alexander Potapenko Jan. 13, 2021, 9:16 a.m. UTC
Introduce error_report_start and error_report_end tracepoints.
Those can be used in debugging tools like KASAN, KFENCE, etc.
to provide extensions to the error reporting mechanisms (e.g. allow
tests hook into error reporting, ease error report collection from
production kernels).
Another benefit would be making use of ftrace for debugging or
benchmarking the tools themselves.

Suggested-by: Marco Elver <elver@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Marco Elver <elver@google.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: linux-mm@kvack.org
Signed-off-by: Alexander Potapenko <glider@google.com>
---
 include/trace/events/error_report.h | 51 +++++++++++++++++++++++++++++
 kernel/trace/Makefile               |  1 +
 kernel/trace/error_report-traces.c  | 11 +++++++
 3 files changed, 63 insertions(+)
 create mode 100644 include/trace/events/error_report.h
 create mode 100644 kernel/trace/error_report-traces.c

Comments

Steven Rostedt Jan. 13, 2021, 9:10 p.m. UTC | #1
On Wed, 13 Jan 2021 10:16:54 +0100
Alexander Potapenko <glider@google.com> wrote:

> +DECLARE_EVENT_CLASS(error_report_template,
> +		    TP_PROTO(const char *error_detector, unsigned long id),

Instead of having a random string, as this should be used by a small finite
set of subsystems, why not make the above into an enum?

> +		    TP_ARGS(error_detector, id),
> +		    TP_STRUCT__entry(__field(const char *, error_detector)
> +					     __field(unsigned long, id)),
> +		    TP_fast_assign(__entry->error_detector = error_detector;
> +				   __entry->id = id;),
> +		    TP_printk("[%s] %lx", __entry->error_detector,

Then the [%s] portion of this could also be just a __print_symbolic().

-- Steve

> +			      __entry->id));
> +
> +/**
Alexander Potapenko Jan. 14, 2021, 7:49 a.m. UTC | #2
On Wed, Jan 13, 2021 at 10:10 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Wed, 13 Jan 2021 10:16:54 +0100
> Alexander Potapenko <glider@google.com> wrote:
>
> > +DECLARE_EVENT_CLASS(error_report_template,
> > +                 TP_PROTO(const char *error_detector, unsigned long id),
>
> Instead of having a random string, as this should be used by a small finite
> set of subsystems, why not make the above into an enum?

You're probably right.
I just thought it might be a good idea to minimize the effort needed
from tools' authors to add these tracepoints to the tools (see the
following two patches), and leave room for some extensibility (e.g.
passing bug type together with the tool name etc.)

> > +                 TP_ARGS(error_detector, id),
> > +                 TP_STRUCT__entry(__field(const char *, error_detector)
> > +                                          __field(unsigned long, id)),
> > +                 TP_fast_assign(__entry->error_detector = error_detector;
> > +                                __entry->id = id;),
> > +                 TP_printk("[%s] %lx", __entry->error_detector,
>
> Then the [%s] portion of this could also be just a __print_symbolic().

We'll need to explicitly list the enum values once again in
__print_symbolic(), right? E.g.:

enum debugging_tool {
         TOOL_KFENCE,
         TOOL_KASAN,
         ...
}

TP_printk(__print_symbolic(__entry->error_detector, TOOL_KFENCE,
TOOL_KASAN, ...),
Marco Elver Jan. 14, 2021, 8:59 a.m. UTC | #3
On Thu, 14 Jan 2021 at 08:50, Alexander Potapenko <glider@google.com> wrote:
>
> On Wed, Jan 13, 2021 at 10:10 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > On Wed, 13 Jan 2021 10:16:54 +0100
> > Alexander Potapenko <glider@google.com> wrote:
> >
> > > +DECLARE_EVENT_CLASS(error_report_template,
> > > +                 TP_PROTO(const char *error_detector, unsigned long id),
> >
> > Instead of having a random string, as this should be used by a small finite
> > set of subsystems, why not make the above into an enum?
>
> You're probably right.
> I just thought it might be a good idea to minimize the effort needed
> from tools' authors to add these tracepoints to the tools (see the
> following two patches), and leave room for some extensibility (e.g.
> passing bug type together with the tool name etc.)
>
> > > +                 TP_ARGS(error_detector, id),
> > > +                 TP_STRUCT__entry(__field(const char *, error_detector)
> > > +                                          __field(unsigned long, id)),
> > > +                 TP_fast_assign(__entry->error_detector = error_detector;
> > > +                                __entry->id = id;),
> > > +                 TP_printk("[%s] %lx", __entry->error_detector,
> >
> > Then the [%s] portion of this could also be just a __print_symbolic().
>
> We'll need to explicitly list the enum values once again in
> __print_symbolic(), right? E.g.:
>
> enum debugging_tool {

We need to use TRACE_DEFINE_ENUM().

>          TOOL_KFENCE,

For consistency I would call the enum simply "ERROR_DETECTOR" as well.
(Hypothetically, there could also be an "error detector" that is not a
"debugging tool".)

>          TOOL_KASAN,
>          ...
> }
>
> TP_printk(__print_symbolic(__entry->error_detector, TOOL_KFENCE,
> TOOL_KASAN, ...),

It takes a list of val -> str. E.g.
__print_symbolic(__entry->error_detector, { TOOL_KFENCE, "KFENCE" }, {
TOOL_KASAN, "KASAN" }).

Looking around the kernel, sometimes this is simplified with macros,
but not sure if it's worth it. Typing the same thing 3 times is fine,
given this list won't grow really fast.

Thanks,
-- Marco
Steven Rostedt Jan. 14, 2021, 2:52 p.m. UTC | #4
On Thu, 14 Jan 2021 08:49:57 +0100
Alexander Potapenko <glider@google.com> wrote:

> We'll need to explicitly list the enum values once again in
> __print_symbolic(), right? E.g.:
> 
> enum debugging_tool {
>          TOOL_KFENCE,
>          TOOL_KASAN,
>          ...
> }
> 
> TP_printk(__print_symbolic(__entry->error_detector, TOOL_KFENCE,
> TOOL_KASAN, ...),

Usually what is done is to make this into a macro:

#define REPORT_TOOL_LIST \
  EM(KFENCE, kfence) \
  EMe(KASAN, kasan)

#undef EM
#undef EMe

#define EM(a,b) TRACE_DEFINE_ENUM(a)
#define EMe(a,b) TRACE_DEFINE_ENUM(a)

REPORT_TOOL_LIST

#undef EM
#undef EMe

#define EM(a, b) { a, b },
#define EMe(a, b) { a, b }

#define show_report_tool_list(val) \
	__print_symbolic(val, REPORT_TOOL_LIST)


[..]

 TP_printk("[%s] %lx", show_report_tool_list(__entry->error_detector),
    __entry->id)


This is done in several other trace event files.  For example, see
  include/trace/events/sock.h


-- Steve
Alexander Potapenko Jan. 15, 2021, 12:53 p.m. UTC | #5
On Thu, Jan 14, 2021 at 3:52 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Thu, 14 Jan 2021 08:49:57 +0100
> Alexander Potapenko <glider@google.com> wrote:
>
> > We'll need to explicitly list the enum values once again in
> > __print_symbolic(), right? E.g.:
> >
> > enum debugging_tool {
> >          TOOL_KFENCE,
> >          TOOL_KASAN,
> >          ...
> > }
> >
> > TP_printk(__print_symbolic(__entry->error_detector, TOOL_KFENCE,
> > TOOL_KASAN, ...),
>
> Usually what is done is to make this into a macro:
>
> #define REPORT_TOOL_LIST \
>   EM(KFENCE, kfence) \
>   EMe(KASAN, kasan)

Thanks, will be done in v2!
Note that checkpatch doesn't really like this declaration style,
claiming that "Macros with complex values should be enclosed in
parentheses".
(although it is consistent with what's done in other trace event headers)

>
> #define EM(a,b) TRACE_DEFINE_ENUM(a)
> #define EMe(a,b) TRACE_DEFINE_ENUM(a)

These lines must end with a semicolon, according to other headers (and
that becomes yet another thing that checkpatch barks at).
Steven Rostedt Jan. 15, 2021, 4:17 p.m. UTC | #6
On Fri, 15 Jan 2021 13:53:19 +0100
Alexander Potapenko <glider@google.com> wrote:

> > #define REPORT_TOOL_LIST \
> >   EM(KFENCE, kfence) \
> >   EMe(KASAN, kasan)  
> 
> Thanks, will be done in v2!
> Note that checkpatch doesn't really like this declaration style,
> claiming that "Macros with complex values should be enclosed in
> parentheses".
> (although it is consistent with what's done in other trace event headers)

checkpatch.pl hates most of the tracing macro code ;-)

It's the one place that you can mostly ignore checkpatch reports.

-- Steve
diff mbox series

Patch

diff --git a/include/trace/events/error_report.h b/include/trace/events/error_report.h
new file mode 100644
index 000000000000..bae0129031f0
--- /dev/null
+++ b/include/trace/events/error_report.h
@@ -0,0 +1,51 @@ 
+/* SPDX-License-Identifier: GPL-2.0-only */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM error_report
+
+#if !defined(_TRACE_ERROR_REPORT_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_ERROR_REPORT_H
+
+#include <linux/tracepoint.h>
+
+DECLARE_EVENT_CLASS(error_report_template,
+		    TP_PROTO(const char *error_detector, unsigned long id),
+		    TP_ARGS(error_detector, id),
+		    TP_STRUCT__entry(__field(const char *, error_detector)
+					     __field(unsigned long, id)),
+		    TP_fast_assign(__entry->error_detector = error_detector;
+				   __entry->id = id;),
+		    TP_printk("[%s] %lx", __entry->error_detector,
+			      __entry->id));
+
+/**
+ * error_report_start - called before printing the error report
+ * @error_detector:	short string describing the error detection tool
+ * @id:			pseudo-unique descriptor that can help distinguish reports
+ *			from one another. Depending on the tool, good examples
+ *			could be: memory access address, call site, allocation
+ *			site, etc.
+ *
+ * This event occurs right before a debugging tool starts printing the error
+ * report.
+ */
+DEFINE_EVENT(error_report_template, error_report_start,
+	     TP_PROTO(const char *error_detector, unsigned long id),
+	     TP_ARGS(error_detector, id));
+
+/**
+ * error_report_end - called after printing the error report
+ * @error_detector:	short string describing the error detection tool
+ * @id:			pseudo-unique descriptor, matches that passed to
+ *			error_report_start
+ *
+ * This event occurs right after a debugging tool finishes printing the error
+ * report.
+ */
+DEFINE_EVENT(error_report_template, error_report_end,
+	     TP_PROTO(const char *error_detector, unsigned long id),
+	     TP_ARGS(error_detector, id));
+
+#endif /* _TRACE_ERROR_REPORT_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index 7e44cea89fdc..b28d3e5013cd 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -81,6 +81,7 @@  obj-$(CONFIG_SYNTH_EVENTS) += trace_events_synth.o
 obj-$(CONFIG_HIST_TRIGGERS) += trace_events_hist.o
 obj-$(CONFIG_BPF_EVENTS) += bpf_trace.o
 obj-$(CONFIG_KPROBE_EVENTS) += trace_kprobe.o
+obj-$(CONFIG_TRACEPOINTS) += error_report-traces.o
 obj-$(CONFIG_TRACEPOINTS) += power-traces.o
 ifeq ($(CONFIG_PM),y)
 obj-$(CONFIG_TRACEPOINTS) += rpm-traces.o
diff --git a/kernel/trace/error_report-traces.c b/kernel/trace/error_report-traces.c
new file mode 100644
index 000000000000..80960c52c705
--- /dev/null
+++ b/kernel/trace/error_report-traces.c
@@ -0,0 +1,11 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Error reporting trace points
+ */
+
+#define CREATE_TRACE_POINTS
+#include <trace/events/error_report.h>
+
+EXPORT_TRACEPOINT_SYMBOL_GPL(error_report_start);
+EXPORT_TRACEPOINT_SYMBOL_GPL(error_report_end);
+