diff mbox

[V2,1/4] trace: Introduce an output interface from ftrace to STM

Message ID 1466563613-31578-2-git-send-email-zhang.chunyan@linaro.org (mailing list archive)
State New, archived
Headers show

Commit Message

Chunyan Zhang June 22, 2016, 2:46 a.m. UTC
This patch is introducing a new output to print Ftrace messages
to STM buffer when the traces happen.  In order to reduce the
effect on timing overhead as much as possible, only the current
function and its parent's instruction pointer will be recorded
into STM. This idea was first introduced by Philippe Langlais
at ST-Microelectronics a long time ago[1].

[1] http://thread.gmane.org/gmane.linux.kernel/1114274/focus=1114275

Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org>
---
 include/linux/stm.h              |  7 +++++++
 include/linux/trace_output_stm.h | 17 +++++++++++++++++
 kernel/trace/Makefile            |  1 +
 kernel/trace/trace_output_stm.c  | 41 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 66 insertions(+)
 create mode 100644 include/linux/trace_output_stm.h
 create mode 100644 kernel/trace/trace_output_stm.c

Comments

Alexander Shishkin June 22, 2016, 5:56 a.m. UTC | #1
[adding Felipe for his sudden interest in the subject matter]

Chunyan Zhang <zhang.chunyan@linaro.org> writes:

> +static struct stm_ftrace *trace_output;

What you want is a possibility to have different ftrace outputs, not
different STM outputs for ftrace (again, STM core already does this).
In other words, here, you want to have the notion of "output" be
stm-agnostic, but be a generalized output driver object.

> +
> +void trace_func_to_stm(unsigned long ip, unsigned long parent_ip)
> +{
> +	unsigned long ip_array[2] = {ip, parent_ip};
> +
> +	if (trace_output)
> +		trace_output->write(&trace_output->data, (char *)ip_array,
> +				    sizeof(unsigned long) * 2, STM_FTRACE_CHAN);
> +}

The ip+parent_ip pair is still not a useful output from ftrace
data. Moreover, doing this is basically like inventing another binary
protocol for ftrace data over stm, where ftrace is in and of itself
already a binary protocol, why not just use that? The decoder will
basically depend on the kernel binary from whence the traces are
coming, but this is a requirement even if we want to decypher the
ip+parent_ip data you're proposing.

We would need to take some time to think this through. What we might
consider is:

 * bypassing ftrace ring buffer, sending data directly to an "output",
   which has a drawback of ending up in a driver callback, which needs
   to serialize on its driver stuff and write registers (I did try to
   make stm_write as light as possible when I wrote it, though); the
   good part is that data goes into the wire as soon as it is produced
   instead of being buffered along the way;
 * starting a work (or multiple works) that would traverse new data in
   ftrace buffer and feed it to an "output", such as stm; this has a
   problem of producers being potentially faster than consumers
   (consider 'function' tracer, for example) and hogging the cpus by
   simply exporting trace data; this also botches the stm timestamps,
   which will then be representative of nothing in particular.

> +
> +void trace_add_output(struct stm_ftrace *stm)
> +{
> +	trace_output = stm;
> +}
> +EXPORT_SYMBOL_GPL(trace_add_output);
> +
> +void trace_rm_output(void)
> +{
> +	trace_output = NULL;
> +}
> +EXPORT_SYMBOL_GPL(trace_rm_output);

These, of course, only work because they are implicitly serialized on
stm core's link locks.

Regards,
--
Alex
Chunyan Zhang June 28, 2016, 6:59 a.m. UTC | #2
On Wed, Jun 22, 2016 at 1:56 PM, Alexander Shishkin
<alexander.shishkin@linux.intel.com> wrote:
> [adding Felipe for his sudden interest in the subject matter]
>
> Chunyan Zhang <zhang.chunyan@linaro.org> writes:
>
>> +static struct stm_ftrace *trace_output;
>
> What you want is a possibility to have different ftrace outputs, not
> different STM outputs for ftrace (again, STM core already does this).
> In other words, here, you want to have the notion of "output" be
> stm-agnostic, but be a generalized output driver object.

Ok, I will make it more generic, i.e. an agnostic of the exact output devices.

>
>> +
>> +void trace_func_to_stm(unsigned long ip, unsigned long parent_ip)
>> +{
>> +     unsigned long ip_array[2] = {ip, parent_ip};
>> +
>> +     if (trace_output)
>> +             trace_output->write(&trace_output->data, (char *)ip_array,
>> +                                 sizeof(unsigned long) * 2, STM_FTRACE_CHAN);
>> +}
>
> The ip+parent_ip pair is still not a useful output from ftrace
> data. Moreover, doing this is basically like inventing another binary
> protocol for ftrace data over stm, where ftrace is in and of itself
> already a binary protocol, why not just use that? The decoder will
> basically depend on the kernel binary from whence the traces are
> coming, but this is a requirement even if we want to decypher the
> ip+parent_ip data you're proposing.
>
> We would need to take some time to think this through. What we might
> consider is:
>
>  * bypassing ftrace ring buffer, sending data directly to an "output",
>    which has a drawback of ending up in a driver callback, which needs
>    to serialize on its driver stuff and write registers (I did try to
>    make stm_write as light as possible when I wrote it, though); the
>    good part is that data goes into the wire as soon as it is produced
>    instead of being buffered along the way;

I want to try this way, if I remember correctly, Steven Rostedt said
the similar solution before, use a jump label to switch which "output"
traces should be sent to.  And yes like you said, the drawback is
exporting trace data to STM by writing registers is slower than
writing ring buffer, it would slow down the process.

>  * starting a work (or multiple works) that would traverse new data in
>    ftrace buffer and feed it to an "output", such as stm; this has a
>    problem of producers being potentially faster than consumers
>    (consider 'function' tracer, for example) and hogging the cpus by
>    simply exporting trace data; this also botches the stm timestamps,

With stm timestamps on each trace data, we can sync up every events
happened on the system, not only on CPU but also on other processors,
this is useful if only the timestamps are nearly real-time, from this
view, the second way seems not so more feasible than the first one.

Please correct me if I'm missing something.

Thanks for your comments,
Chunyan

>    which will then be representative of nothing in particular.
>
>> +
>> +void trace_add_output(struct stm_ftrace *stm)
>> +{
>> +     trace_output = stm;
>> +}
>> +EXPORT_SYMBOL_GPL(trace_add_output);
>> +
>> +void trace_rm_output(void)
>> +{
>> +     trace_output = NULL;
>> +}
>> +EXPORT_SYMBOL_GPL(trace_rm_output);
>
> These, of course, only work because they are implicitly serialized on
> stm core's link locks.
>
> Regards,
> --
> Alex
diff mbox

Patch

diff --git a/include/linux/stm.h b/include/linux/stm.h
index 8369d8a..2b6639a 100644
--- a/include/linux/stm.h
+++ b/include/linux/stm.h
@@ -129,6 +129,13 @@  struct stm_source_data {
 	void			(*unlink)(struct stm_source_data *data);
 };
 
+struct stm_ftrace {
+	struct stm_source_data	data;
+	void (*write)(struct stm_source_data *data, const char *buf,
+		      unsigned int len, unsigned int chan);
+	bool available;
+};
+
 int stm_source_register_device(struct device *parent,
 			       struct stm_source_data *data);
 void stm_source_unregister_device(struct stm_source_data *data);
diff --git a/include/linux/trace_output_stm.h b/include/linux/trace_output_stm.h
new file mode 100644
index 0000000..7edc680
--- /dev/null
+++ b/include/linux/trace_output_stm.h
@@ -0,0 +1,17 @@ 
+#ifndef __TRACE_OUTPUT_STM_H
+#define __TRACE_OUTPUT_STM_H
+
+#include <linux/module.h>
+
+#if IS_ENABLED(CONFIG_STM_SOURCE_FTRACE)
+struct stm_ftrace;
+extern void
+trace_func_to_stm(unsigned long ip, unsigned long parent_ip);
+extern void trace_add_output(struct stm_ftrace *stm);
+extern void trace_rm_output(void);
+#else
+static inline void
+trace_func_to_stm(unsigned long ip, unsigned long parent_ip) {}
+#endif
+
+#endif /* __TRACE_OUTPUT_STM_H */
diff --git a/kernel/trace/Makefile b/kernel/trace/Makefile
index 979e7bf..5330803 100644
--- a/kernel/trace/Makefile
+++ b/kernel/trace/Makefile
@@ -69,4 +69,5 @@  obj-$(CONFIG_UPROBE_EVENT) += trace_uprobe.o
 
 obj-$(CONFIG_TRACEPOINT_BENCHMARK) += trace_benchmark.o
 
+obj-$(CONFIG_TRACING) += trace_output_stm.o
 libftrace-y := ftrace.o
diff --git a/kernel/trace/trace_output_stm.c b/kernel/trace/trace_output_stm.c
new file mode 100644
index 0000000..dc04952
--- /dev/null
+++ b/kernel/trace/trace_output_stm.c
@@ -0,0 +1,41 @@ 
+/*
+ * Output interface from Ftrace to STM buffer
+ * Copyright (c) 2016, Linaro Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include <linux/stm.h>
+
+/* Offset above the start channel number */
+#define STM_FTRACE_CHAN 0
+
+static struct stm_ftrace *trace_output;
+
+void trace_func_to_stm(unsigned long ip, unsigned long parent_ip)
+{
+	unsigned long ip_array[2] = {ip, parent_ip};
+
+	if (trace_output)
+		trace_output->write(&trace_output->data, (char *)ip_array,
+				    sizeof(unsigned long) * 2, STM_FTRACE_CHAN);
+}
+
+void trace_add_output(struct stm_ftrace *stm)
+{
+	trace_output = stm;
+}
+EXPORT_SYMBOL_GPL(trace_add_output);
+
+void trace_rm_output(void)
+{
+	trace_output = NULL;
+}
+EXPORT_SYMBOL_GPL(trace_rm_output);