diff mbox series

[v3] libtracefs: Add APIs for data streaming

Message ID 20210624145101.87233-1-y.karadz@gmail.com (mailing list archive)
State Superseded
Headers show
Series [v3] libtracefs: Add APIs for data streaming | expand

Commit Message

Yordan Karadzhov June 24, 2021, 2:51 p.m. UTC
The new APIs can be used to dump the content of "trace_pipe" into a
file or directly to "stdout". The "splice" system call is used to
moves the data without copying. The new functionality is essentially
identical to what 'trace-cmd show -p' does.

Signed-off-by: Yordan Karadzhov (VMware) <y.karadz@gmail.com>
---
 Documentation/libtracefs-stream.txt | 106 ++++++++++++++++++++++++++++
 include/tracefs-local.h             |   1 +
 include/tracefs.h                   |   5 ++
 src/tracefs-tools.c                 |  99 ++++++++++++++++++++++++++
 4 files changed, 211 insertions(+)
 create mode 100644 Documentation/libtracefs-stream.txt

Comments

Steven Rostedt June 24, 2021, 4:45 p.m. UTC | #1
On Thu, 24 Jun 2021 17:51:01 +0300
"Yordan Karadzhov (VMware)" <y.karadz@gmail.com> wrote:

> The new APIs can be used to dump the content of "trace_pipe" into a
> file or directly to "stdout". The "splice" system call is used to
> moves the data without copying. The new functionality is essentially
> identical to what 'trace-cmd show -p' does.
> 
> Signed-off-by: Yordan Karadzhov (VMware) <y.karadz@gmail.com>
> 

So I did some investigations here, and found that splice() is not
guaranteed to work on a terminal output.

To overcome this, I decided to have the print create another pipe, and
pass that to the stream code. The if something was read, it would read
the pipe.

It required changing the return of the functions to return the amount
transferred, so that I can differentiate between "EOF" with nothing
read, and something read. Because I couldn't get the pipe to not block
on read if there was no content in it. :-/

Anyway, I submitted a v4 with the changes I made, and it appears to
work. I haven't tested it that much, so it may still have bugs.

We could optimize it to just use the splice output first (stream
first), and if that returns with EINVAL, which is the expected error on
a non functioning stdout, we could then switch to this alternate method.

Although, the stream would have read some of the pipe and lost the
data, which is not what we want, so a test of stdout would need to be
made with splice first. :-/

-- Steve
Yordan Karadzhov June 25, 2021, 1:17 p.m. UTC | #2
On 24.06.21 г. 19:45, Steven Rostedt wrote:
> On Thu, 24 Jun 2021 17:51:01 +0300
> "Yordan Karadzhov (VMware)" <y.karadz@gmail.com> wrote:
> 
>> The new APIs can be used to dump the content of "trace_pipe" into a
>> file or directly to "stdout". The "splice" system call is used to
>> moves the data without copying. The new functionality is essentially
>> identical to what 'trace-cmd show -p' does.
>>
>> Signed-off-by: Yordan Karadzhov (VMware) <y.karadz@gmail.com>
>>
> 
> So I did some investigations here, and found that splice() is not
> guaranteed to work on a terminal output.
> 
> To overcome this, I decided to have the print create another pipe, and
> pass that to the stream code. The if something was read, it would read
> the pipe.
> 
> It required changing the return of the functions to return the amount
> transferred, so that I can differentiate between "EOF" with nothing
> read, and something read. Because I couldn't get the pipe to not block
> on read if there was no content in it. :-/
> 
> Anyway, I submitted a v4 with the changes I made, and it appears to
> work. I haven't tested it that much, so it may still have bugs.
> 

Hi Steve,

I tested v4 of the patch that you submitted and it doesn't work correct in my use case. Also it seems greatly 
over-complicated to me. Note that when you call tracefs_trace_pipe_stream() inside a loop you will keep opening and 
closing the "trace_pipe" and the pipe over and over again.

See the example code below. Is this what you are aiming? Also is this guaranteed to always work?

Thanks!
Yordan


#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>
#include <signal.h>
#include <stdbool.h>
#include <stdio.h>
#include <errno.h>

volatile bool keep_going = true;

static void finish(int sig)
{
	keep_going = false;
}

int main(int argc, char **argv)
{
	int splice_flags = SPLICE_F_MORE | SPLICE_F_MOVE;
	int brass1[2], brass2[2], fd, ret;
	char buf[BUFSIZ];

	signal(SIGINT, finish);

	fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	if (pipe(brass1) < 0){
		perror("pipe A");
		return 1;
	}

	if (pipe(brass2) < 0){
		perror("pipe B");
		return 1;
	}

	errno = 0;
	while (keep_going) {
		ret = splice(fd, NULL,
			     brass1[1], NULL,
			     BUFSIZ, splice_flags);
		if (ret < 0) {
			perror("splice A");
			return 1;
		}
		ret = splice(brass1[0], NULL,
			     brass2[1], NULL,
			     BUFSIZ, splice_flags);
		if (ret < 0) {
			perror("splice B");
			return 1;
		}

		ret = read(brass2[0], buf, BUFSIZ);
		if (ret < 0) {
			perror("read");
			return 1;
		}

		ret = write(STDOUT_FILENO, buf, ret);
		if (ret < 0) {
			perror("write");
			return 1;
		}
	}

	close(fd);
	close(brass1[0]);
	close(brass1[1]);
	close(brass2[0]);
	close(brass2[1]);

	return 0;
}



> We could optimize it to just use the splice output first (stream
> first), and if that returns with EINVAL, which is the expected error on
> a non functioning stdout, we could then switch to this alternate method.
> 
> Although, the stream would have read some of the pipe and lost the
> data, which is not what we want, so a test of stdout would need to be
> made with splice first. :-/
> 
> -- Steve
>
Steven Rostedt June 25, 2021, 1:46 p.m. UTC | #3
On Fri, 25 Jun 2021 16:17:14 +0300
Yordan Karadzhov <y.karadz@gmail.com> wrote:

> On 24.06.21 г. 19:45, Steven Rostedt wrote:
> > On Thu, 24 Jun 2021 17:51:01 +0300
> > "Yordan Karadzhov (VMware)" <y.karadz@gmail.com> wrote:
> >   
> >> The new APIs can be used to dump the content of "trace_pipe" into a
> >> file or directly to "stdout". The "splice" system call is used to
> >> moves the data without copying. The new functionality is essentially
> >> identical to what 'trace-cmd show -p' does.
> >>
> >> Signed-off-by: Yordan Karadzhov (VMware) <y.karadz@gmail.com>
> >>  
> > 
> > So I did some investigations here, and found that splice() is not
> > guaranteed to work on a terminal output.
> > 
> > To overcome this, I decided to have the print create another pipe, and
> > pass that to the stream code. The if something was read, it would read
> > the pipe.
> > 
> > It required changing the return of the functions to return the amount
> > transferred, so that I can differentiate between "EOF" with nothing
> > read, and something read. Because I couldn't get the pipe to not block
> > on read if there was no content in it. :-/
> > 
> > Anyway, I submitted a v4 with the changes I made, and it appears to
> > work. I haven't tested it that much, so it may still have bugs.
> >   
> 
> Hi Steve,
> 
> I tested v4 of the patch that you submitted and it doesn't work correct in my use case. Also it seems greatly 
> over-complicated to me. Note that when you call tracefs_trace_pipe_stream() inside a loop you will keep opening and 
> closing the "trace_pipe" and the pipe over and over again.

Good point. I could update it to pull out the main work of the splice
logic and have both functions use that, and not have one call the other.

What didn't work with your use case?

> 
> See the example code below. Is this what you are aiming? Also is this guaranteed to always work?

Similar but I'll comment below. Note, the real issue I had in testing
my code was blocking. The read on the pipe would block on EOF even with
a NONBLOCK flag :-/ Which is not what I wanted, and to fix that, it
made the code a bit more complex.

> 
> Thanks!
> Yordan
> 
> 
> #define _GNU_SOURCE
> #include <fcntl.h>
> #include <unistd.h>
> #include <signal.h>
> #include <stdbool.h>
> #include <stdio.h>
> #include <errno.h>
> 
> volatile bool keep_going = true;
> 
> static void finish(int sig)
> {
> 	keep_going = false;
> }
> 
> int main(int argc, char **argv)
> {
> 	int splice_flags = SPLICE_F_MORE | SPLICE_F_MOVE;
> 	int brass1[2], brass2[2], fd, ret;
> 	char buf[BUFSIZ];
> 
> 	signal(SIGINT, finish);
> 
> 	fd = open(argv[1], O_RDONLY);
> 	if (fd < 0) {
> 		perror("open");
> 		return 1;
> 	}
> 
> 	if (pipe(brass1) < 0){
> 		perror("pipe A");
> 		return 1;
> 	}
> 
> 	if (pipe(brass2) < 0){
> 		perror("pipe B");
> 		return 1;
> 	}
> 
> 	errno = 0;
> 	while (keep_going) {
> 		ret = splice(fd, NULL,
> 			     brass1[1], NULL,
> 			     BUFSIZ, splice_flags);

The above can block, and if it does, and you hit "ctrl-C" it will
return with ret < 0 and have errno set to EINTR.

> 		if (ret < 0) {

With a Ctrl-C, this would error.

> 			perror("splice A");
> 			return 1;
> 		}
> 		ret = splice(brass1[0], NULL,
> 			     brass2[1], NULL,
> 			     BUFSIZ, splice_flags);
> 		if (ret < 0) {
> 			perror("splice B");
> 			return 1;
> 		}
> 
> 		ret = read(brass2[0], buf, BUFSIZ);

Same here, with the blocking issue.

-- Steve

> 		if (ret < 0) {
> 			perror("read");
> 			return 1;
> 		}
> 
> 		ret = write(STDOUT_FILENO, buf, ret);
> 		if (ret < 0) {
> 			perror("write");
> 			return 1;
> 		}
> 	}
> 
> 	close(fd);
> 	close(brass1[0]);
> 	close(brass1[1]);
> 	close(brass2[0]);
> 	close(brass2[1]);
> 
> 	return 0;
> }
Yordan Karadzhov June 25, 2021, 1:57 p.m. UTC | #4
On 25.06.21 г. 16:46, Steven Rostedt wrote:
> On Fri, 25 Jun 2021 16:17:14 +0300
> Yordan Karadzhov <y.karadz@gmail.com> wrote:
> 
>> On 24.06.21 г. 19:45, Steven Rostedt wrote:
>>> On Thu, 24 Jun 2021 17:51:01 +0300
>>> "Yordan Karadzhov (VMware)" <y.karadz@gmail.com> wrote:
>>>    
>>>> The new APIs can be used to dump the content of "trace_pipe" into a
>>>> file or directly to "stdout". The "splice" system call is used to
>>>> moves the data without copying. The new functionality is essentially
>>>> identical to what 'trace-cmd show -p' does.
>>>>
>>>> Signed-off-by: Yordan Karadzhov (VMware) <y.karadz@gmail.com>
>>>>   
>>>
>>> So I did some investigations here, and found that splice() is not
>>> guaranteed to work on a terminal output.
>>>
>>> To overcome this, I decided to have the print create another pipe, and
>>> pass that to the stream code. The if something was read, it would read
>>> the pipe.
>>>
>>> It required changing the return of the functions to return the amount
>>> transferred, so that I can differentiate between "EOF" with nothing
>>> read, and something read. Because I couldn't get the pipe to not block
>>> on read if there was no content in it. :-/
>>>
>>> Anyway, I submitted a v4 with the changes I made, and it appears to
>>> work. I haven't tested it that much, so it may still have bugs.
>>>    
>>
>> Hi Steve,
>>
>> I tested v4 of the patch that you submitted and it doesn't work correct in my use case. Also it seems greatly
>> over-complicated to me. Note that when you call tracefs_trace_pipe_stream() inside a loop you will keep opening and
>> closing the "trace_pipe" and the pipe over and over again.
> 
> Good point. I could update it to pull out the main work of the splice
> logic and have both functions use that, and not have one call the other.
> 
> What didn't work with your use case?
> 
>>
>> See the example code below. Is this what you are aiming? Also is this guaranteed to always work?
> 
> Similar but I'll comment below. Note, the real issue I had in testing
> my code was blocking. The read on the pipe would block on EOF even with
> a NONBLOCK flag :-/ Which is not what I wanted, and to fix that, it
> made the code a bit more complex.
> 
>>
>> Thanks!
>> Yordan
>>
>>
>> #define _GNU_SOURCE
>> #include <fcntl.h>
>> #include <unistd.h>
>> #include <signal.h>
>> #include <stdbool.h>
>> #include <stdio.h>
>> #include <errno.h>
>>
>> volatile bool keep_going = true;
>>
>> static void finish(int sig)
>> {
>> 	keep_going = false;
>> }
>>
>> int main(int argc, char **argv)
>> {
>> 	int splice_flags = SPLICE_F_MORE | SPLICE_F_MOVE;
>> 	int brass1[2], brass2[2], fd, ret;
>> 	char buf[BUFSIZ];
>>
>> 	signal(SIGINT, finish);
>>
>> 	fd = open(argv[1], O_RDONLY);
>> 	if (fd < 0) {
>> 		perror("open");
>> 		return 1;
>> 	}
>>
>> 	if (pipe(brass1) < 0){
>> 		perror("pipe A");
>> 		return 1;
>> 	}
>>
>> 	if (pipe(brass2) < 0){
>> 		perror("pipe B");
>> 		return 1;
>> 	}
>>
>> 	errno = 0;
>> 	while (keep_going) {
>> 		ret = splice(fd, NULL,
>> 			     brass1[1], NULL,
>> 			     BUFSIZ, splice_flags);
> 
> The above can block, and if it does, and you hit "ctrl-C" it will
> return with ret < 0 and have errno set to EINTR.
> 
>> 		if (ret < 0) {
> 
> With a Ctrl-C, this would error.

Yes, Isn't this what we want. We want to stop the loop.

Thanks!
Y.

> 
>> 			perror("splice A");
>> 			return 1;
>> 		}
>> 		ret = splice(brass1[0], NULL,
>> 			     brass2[1], NULL,
>> 			     BUFSIZ, splice_flags);
>> 		if (ret < 0) {
>> 			perror("splice B");
>> 			return 1;
>> 		}
>>
>> 		ret = read(brass2[0], buf, BUFSIZ);
> 
> Same here, with the blocking issue.
> 
> -- Steve
> 
>> 		if (ret < 0) {
>> 			perror("read");
>> 			return 1;
>> 		}
>>
>> 		ret = write(STDOUT_FILENO, buf, ret);
>> 		if (ret < 0) {
>> 			perror("write");
>> 			return 1;
>> 		}
>> 	}
>>
>> 	close(fd);
>> 	close(brass1[0]);
>> 	close(brass1[1]);
>> 	close(brass2[0]);
>> 	close(brass2[1]);
>>
>> 	return 0;
>> }
diff mbox series

Patch

diff --git a/Documentation/libtracefs-stream.txt b/Documentation/libtracefs-stream.txt
new file mode 100644
index 0000000..24c2e47
--- /dev/null
+++ b/Documentation/libtracefs-stream.txt
@@ -0,0 +1,106 @@ 
+libtracefs(3)
+=============
+
+NAME
+----
+tracefs_trace_pipe_stream, tracefs_trace_pipe_print -
+redirect the stream of trace data to an output or stdout.
+
+SYNOPSIS
+--------
+[verse]
+--
+*#include <tracefs.h>*
+
+int tracefs_trace_pipe_stream(int fd, struct tracefs_instance *instance, int flags);
+int tracefs_trace_pipe_print(struct tracefs_instance *instance);
+
+--
+
+DESCRIPTION
+-----------
+If NULL is passed as _instance_, the top trace instance is used.
+The user can interrupt the streaming of the data by pressing Ctrl-c.
+
+The _tracefs_trace_pipe_stream()_ function redirects the stream of trace data to an output
+file. The "splice" system call is used to moves the data without copying between kernel
+address space and user address space. The _fd_ is the file descriptor of the output file
+and _flags_ is a bit mask of flags to be passed to the "splice" system call.
+
+The _tracefs_trace_pipe_print()_ function is similar to _tracefs_trace_pipe_stream()_, but
+the stream of trace data is redirected to stdout.
+
+
+RETURN VALUE
+------------
+The _tracefs_trace_pipe_stream()_, and _tracefs_trace_pipe_print()_ functions return 0 if the operation is
+successful, or -1 in case of an error.
+
+EXAMPLE
+-------
+[source,c]
+--
+#include <unistd.h>
+#include <signal.h>
+
+#include <tracefs.h>
+
+void stop(int sig)
+{
+	tracefs_trace_pipe_stop(NULL);
+}
+
+int main()
+{
+	mode_t mode = S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH;
+	const char *filename = "trace.txt";
+	int fd = creat(filename, mode);
+	int ret;
+
+	signal(SIGINT, stop);
+	ret = tracefs_trace_pipe_stream(fd, NULL, 0);
+	close(fd);
+
+	return ret;
+}
+--
+FILES
+-----
+[verse]
+--
+*tracefs.h*
+	Header file to include in order to have access to the library APIs.
+*-ltracefs*
+	Linker switch to add when building a program that uses the library.
+--
+
+SEE ALSO
+--------
+_libtracefs(3)_,
+_libtraceevent(3)_,
+_trace-cmd(1)_,
+Documentation/trace/ftrace.rst from the Linux kernel tree
+
+AUTHOR
+------
+[verse]
+--
+*Steven Rostedt* <rostedt@goodmis.org>
+*Tzvetomir Stoyanov* <tz.stoyanov@gmail.com>
+--
+REPORTING BUGS
+--------------
+Report bugs to  <linux-trace-devel@vger.kernel.org>
+
+LICENSE
+-------
+libtracefs is Free Software licensed under the GNU LGPL 2.1
+
+RESOURCES
+---------
+https://git.kernel.org/pub/scm/libs/libtrace/libtracefs.git/
+
+COPYING
+-------
+Copyright \(C) 2021 VMware, Inc. Free use of this software is granted under
+the terms of the GNU Public License (GPL).
diff --git a/include/tracefs-local.h b/include/tracefs-local.h
index 9e3aa18..73698e8 100644
--- a/include/tracefs-local.h
+++ b/include/tracefs-local.h
@@ -30,6 +30,7 @@  struct tracefs_instance {
 	int				ftrace_notrace_fd;
 	int				ftrace_marker_fd;
 	int				ftrace_marker_raw_fd;
+	bool				pipe_keep_going;
 };
 
 extern pthread_mutex_t toplevel_lock;
diff --git a/include/tracefs.h b/include/tracefs.h
index e29b550..3c6741b 100644
--- a/include/tracefs.h
+++ b/include/tracefs.h
@@ -184,4 +184,9 @@  int tracefs_function_notrace(struct tracefs_instance *instance, const char *filt
 /* Control library logs */
 void tracefs_set_loglevel(enum tep_loglevel level);
 
+int tracefs_trace_pipe_stream(int fd, struct tracefs_instance *instance, int flags);
+int tracefs_trace_pipe_print(struct tracefs_instance *instance);
+void tracefs_trace_pipe_stop(struct tracefs_instance *instance);
+
+
 #endif /* _TRACE_FS_H */
diff --git a/src/tracefs-tools.c b/src/tracefs-tools.c
index 0cbb56d..a4fc36d 100644
--- a/src/tracefs-tools.c
+++ b/src/tracefs-tools.c
@@ -912,3 +912,102 @@  int tracefs_function_notrace(struct tracefs_instance *instance, const char *filt
 	tracefs_put_tracing_file(filter_path);
 	return ret;
 }
+
+static bool top_pipe_keep_going;
+
+/**
+ * tracefs_trace_pipe_stream - redirect the stream of trace data to an output
+ * file. The "splice" system call is used to moves the data without copying
+ * between kernel address space and user address space. The user can interrupt
+ * the streaming of the data by pressing Ctrl-c.
+ * @fd: The file descriptor of the output file.
+ * @instance: ftrace instance, can be NULL for top tracing instance.
+ * @flags: flags to be passed to the "splice" system call.
+ *
+ * Returns -1 in case of an error or 0 otherwise.
+ */
+int tracefs_trace_pipe_stream(int fd, struct tracefs_instance *instance,
+			      int flags)
+{
+	bool *keep_going = instance ? &instance->pipe_keep_going :
+				      &top_pipe_keep_going;
+	const char *file = "trace_pipe";
+	int brass[2], in_fd, ret = -1;
+	off_t data_size;
+
+	(*(volatile bool *)keep_going) = true;
+
+	in_fd = tracefs_instance_file_open(instance, file, O_RDONLY);
+	if (in_fd < 0) {
+		tracefs_warning("Failed to open 'trace_pipe'.");
+		return ret;
+	}
+
+	if(pipe(brass) < 0) {
+		tracefs_warning("Failed to open pipe.");
+		goto close_file;
+	}
+
+	data_size = fcntl(brass[0], F_GETPIPE_SZ);
+	if (data_size <= 0) {
+		tracefs_warning("Failed to open pipe (size=0).");
+		goto close_all;
+	}
+
+	while (*(volatile bool *)keep_going) {
+		ret = splice(in_fd, NULL,
+			     brass[1], NULL,
+			     data_size, flags);
+		if (ret < 0)
+			break;
+
+		ret = splice(brass[0], NULL,
+			     fd, NULL,
+			     data_size, flags);
+		if (ret < 0)
+			break;
+	}
+
+	/*
+	 * Do not return error in the case when the "splice" system call
+	 * was interrupted by the user (pressing Ctrl-c).
+	 */
+	if (!keep_going)
+		ret = 0;
+
+ close_all:
+	close(brass[0]);
+	close(brass[1]);
+ close_file:
+	close(in_fd);
+
+	return ret;
+}
+
+/**
+ * tracefs_trace_pipe_print - redirect the stream of trace data to "stdout".
+ * The "splice" system call is used to moves the data without copying
+ * between kernel address space and user address space.
+ * @instance: ftrace instance, can be NULL for top tracing instance.
+ *
+ * Returns -1 in case of an error or 0 otherwise.
+ */
+
+int tracefs_trace_pipe_print(struct tracefs_instance *instance)
+{
+   return tracefs_trace_pipe_stream(STDOUT_FILENO,
+				    instance,
+				    SPLICE_F_MORE | SPLICE_F_MOVE);
+}
+
+/**
+ * tracefs_trace_pipe_stop - stop the streaming of trace data.
+ * @instance: ftrace instance, can be NULL for top tracing instance.
+ */
+void tracefs_trace_pipe_stop(struct tracefs_instance *instance)
+{
+	if (instance)
+		instance->pipe_keep_going = false;
+	else
+		top_pipe_keep_going = false;
+}