diff mbox series

[1/1] tracing: Support reading trace event format file larger than PAGE_SIZE

Message ID 20250102174317.1594-1-shiju.jose@huawei.com (mailing list archive)
State Rejected
Headers show
Series [1/1] tracing: Support reading trace event format file larger than PAGE_SIZE | expand

Commit Message

Shiju Jose Jan. 2, 2025, 5:43 p.m. UTC
From: Shiju Jose <shiju.jose@huawei.com>

When userspace reads a trace event format file, the maximum data size
that can be read is limited to PAGE_SIZE by the seq_read() and
seq_read_iter() functions. This results in userspace receiving partial
data if the format file is larger than PAGE_SIZE, requiring a workaround
to read the complete data from the format file.

Add support for reading trace event format files larger than PAGE_SIZE when
needed by userspace.

Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
---
 kernel/trace/trace_events.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

Comments

Steven Rostedt Jan. 6, 2025, 5:22 p.m. UTC | #1
On Thu, 2 Jan 2025 17:43:17 +0000
<shiju.jose@huawei.com> wrote:

> From: Shiju Jose <shiju.jose@huawei.com>
> 
> When userspace reads a trace event format file, the maximum data size
> that can be read is limited to PAGE_SIZE by the seq_read() and
> seq_read_iter() functions. This results in userspace receiving partial
> data if the format file is larger than PAGE_SIZE, requiring a workaround
> to read the complete data from the format file.
> 
> Add support for reading trace event format files larger than PAGE_SIZE when
> needed by userspace.
> 
> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>

How is this an issue? This is common for all pseudo files and can be
handled properly from user space.

Here, with this program:

read.c:
-------------------------8<-------------------------
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>

int main (int argc, char **argv)
{
	char *buf;
	int fd;
	off_t size;
	int r, s;

	if (argc < 2) {
		printf("usage: %s file\n", argv[0]);
		exit(-1);
	}

	fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror(argv[1]);
		exit(-1);
	}

	size = BUFSIZ * 10;

	buf = malloc(size);
	for (s = 0, r = 1; r > 0; s += r) {
		r = read(fd, buf, size);
		if (r < 0) {
			perror(argv[1]);
			exit(-1);
		}
		printf("Read %d bytes from %s\n", r, argv[1]);
	}
	free(buf);
	close(fd);
	return 0;
}
------------------------->8-------------------------

 $ read /proc/kallsyms
Read 4091 bytes from /proc/kallsyms
Read 4075 bytes from /proc/kallsyms
Read 4078 bytes from /proc/kallsyms
Read 4083 bytes from /proc/kallsyms
Read 4093 bytes from /proc/kallsyms
Read 4076 bytes from /proc/kallsyms
Read 4080 bytes from /proc/kallsyms
Read 4086 bytes from /proc/kallsyms
Read 4080 bytes from /proc/kallsyms
Read 4064 bytes from /proc/kallsyms
Read 4071 bytes from /proc/kallsyms
Read 4063 bytes from /proc/kallsyms
Read 4069 bytes from /proc/kallsyms
Read 4079 bytes from /proc/kallsyms
Read 4063 bytes from /proc/kallsyms
Read 4072 bytes from /proc/kallsyms
Read 4046 bytes from /proc/kallsyms
Read 4091 bytes from /proc/kallsyms
Read 4090 bytes from /proc/kallsyms
Read 4067 bytes from /proc/kallsyms
Read 4080 bytes from /proc/kallsyms
Read 4066 bytes from /proc/kallsyms
Read 4085 bytes from /proc/kallsyms
Read 4095 bytes from /proc/kallsyms
Read 4076 bytes from /proc/kallsyms
Read 4090 bytes from /proc/kallsyms
Read 4066 bytes from /proc/kallsyms
Read 4073 bytes from /proc/kallsyms
Read 4091 bytes from /proc/kallsyms
Read 4075 bytes from /proc/kallsyms
Read 4076 bytes from /proc/kallsyms
Read 4048 bytes from /proc/kallsyms
Read 4074 bytes from /proc/kallsyms
Read 4058 bytes from /proc/kallsyms
Read 4074 bytes from /proc/kallsyms
[..]
Read 4052 bytes from /proc/kallsyms
Read 4061 bytes from /proc/kallsyms
Read 4061 bytes from /proc/kallsyms
Read 4053 bytes from /proc/kallsyms
Read 4083 bytes from /proc/kallsyms
Read 4066 bytes from /proc/kallsyms
Read 4093 bytes from /proc/kallsyms
Read 4072 bytes from /proc/kallsyms
Read 1982 bytes from /proc/kallsyms
Read 0 bytes from /proc/kallsyms

You see, it requires multiple reads to pull in an entire kernel pseudo
file. None of those reads are greater than PAGE_SIZE. Why should trace
format files be any different?

libtracefs handles this perfectly fine:

  https://git.kernel.org/pub/scm/libs/libtrace/libtracefs.git/tree/src/tracefs-utils.c#n343

Looks like you are trying to change the kernel to fix a user space bug :-/

 NAK!

-- Steve
Shiju Jose Jan. 6, 2025, 6:15 p.m. UTC | #2
>-----Original Message-----
>From: Steven Rostedt <rostedt@goodmis.org>
>Sent: 06 January 2025 17:22
>To: Shiju Jose <shiju.jose@huawei.com>
>Cc: mhiramat@kernel.org; mathieu.desnoyers@efficios.com; linux-trace-
>kernel@vger.kernel.org; linux-kernel@vger.kernel.org; Linuxarm
><linuxarm@huawei.com>; Jonathan Cameron
><jonathan.cameron@huawei.com>; tanxiaofei <tanxiaofei@huawei.com>;
>Zengtao (B) <prime.zeng@hisilicon.com>
>Subject: Re: [PATCH 1/1] tracing: Support reading trace event format file larger
>than PAGE_SIZE
>
>On Thu, 2 Jan 2025 17:43:17 +0000
><shiju.jose@huawei.com> wrote:
>
>> From: Shiju Jose <shiju.jose@huawei.com>
>>
>> When userspace reads a trace event format file, the maximum data size
>> that can be read is limited to PAGE_SIZE by the seq_read() and
>> seq_read_iter() functions. This results in userspace receiving partial
>> data if the format file is larger than PAGE_SIZE, requiring a
>> workaround to read the complete data from the format file.
>>
>> Add support for reading trace event format files larger than PAGE_SIZE
>> when needed by userspace.
>>
>> Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
>
>How is this an issue? This is common for all pseudo files and can be handled
>properly from user space.
>
>Here, with this program:
>
>read.c:
>-------------------------8<-------------------------
>#include <stdlib.h>
>#include <stdio.h>
>#include <unistd.h>
>#include <fcntl.h>
>#include <sys/types.h>
>
>int main (int argc, char **argv)
>{
>	char *buf;
>	int fd;
>	off_t size;
>	int r, s;
>
>	if (argc < 2) {
>		printf("usage: %s file\n", argv[0]);
>		exit(-1);
>	}
>
>	fd = open(argv[1], O_RDONLY);
>	if (fd < 0) {
>		perror(argv[1]);
>		exit(-1);
>	}
>
>	size = BUFSIZ * 10;
>
>	buf = malloc(size);
>	for (s = 0, r = 1; r > 0; s += r) {
>		r = read(fd, buf, size);
>		if (r < 0) {
>			perror(argv[1]);
>			exit(-1);
>		}
>		printf("Read %d bytes from %s\n", r, argv[1]);
>	}
>	free(buf);
>	close(fd);
>	return 0;
>}
>------------------------->8-------------------------
>
> $ read /proc/kallsyms
>Read 4091 bytes from /proc/kallsyms
>Read 4075 bytes from /proc/kallsyms
>Read 4078 bytes from /proc/kallsyms
>Read 4083 bytes from /proc/kallsyms
>Read 4093 bytes from /proc/kallsyms
>Read 4076 bytes from /proc/kallsyms
>Read 4080 bytes from /proc/kallsyms
>Read 4086 bytes from /proc/kallsyms
>Read 4080 bytes from /proc/kallsyms
>Read 4064 bytes from /proc/kallsyms
>Read 4071 bytes from /proc/kallsyms
>Read 4063 bytes from /proc/kallsyms
>Read 4069 bytes from /proc/kallsyms
>Read 4079 bytes from /proc/kallsyms
>Read 4063 bytes from /proc/kallsyms
>Read 4072 bytes from /proc/kallsyms
>Read 4046 bytes from /proc/kallsyms
>Read 4091 bytes from /proc/kallsyms
>Read 4090 bytes from /proc/kallsyms
>Read 4067 bytes from /proc/kallsyms
>Read 4080 bytes from /proc/kallsyms
>Read 4066 bytes from /proc/kallsyms
>Read 4085 bytes from /proc/kallsyms
>Read 4095 bytes from /proc/kallsyms
>Read 4076 bytes from /proc/kallsyms
>Read 4090 bytes from /proc/kallsyms
>Read 4066 bytes from /proc/kallsyms
>Read 4073 bytes from /proc/kallsyms
>Read 4091 bytes from /proc/kallsyms
>Read 4075 bytes from /proc/kallsyms
>Read 4076 bytes from /proc/kallsyms
>Read 4048 bytes from /proc/kallsyms
>Read 4074 bytes from /proc/kallsyms
>Read 4058 bytes from /proc/kallsyms
>Read 4074 bytes from /proc/kallsyms
>[..]
>Read 4052 bytes from /proc/kallsyms
>Read 4061 bytes from /proc/kallsyms
>Read 4061 bytes from /proc/kallsyms
>Read 4053 bytes from /proc/kallsyms
>Read 4083 bytes from /proc/kallsyms
>Read 4066 bytes from /proc/kallsyms
>Read 4093 bytes from /proc/kallsyms
>Read 4072 bytes from /proc/kallsyms
>Read 1982 bytes from /proc/kallsyms
>Read 0 bytes from /proc/kallsyms
>
>You see, it requires multiple reads to pull in an entire kernel pseudo file. None of
>those reads are greater than PAGE_SIZE. Why should trace format files be any
>different?
Thanks for the reply.
Yes. I had a fix/workaround in the userspace rasdaemon with multiple reads like above
as reported previously in the  following thread.  
https://lore.kernel.org/lkml/3c9808a694d242cab35bab67602edebf@huawei.com/
However thought a solution in the common kernel code for the format file
may be better.  I will go ahead with the user space solution.
 
Also shared an information in the above thread about libtraceevent  __parse_event() does not
return error when parse_format() fail with incomplete formt data, which resulted initialization for the
trace event does not fail in the user space tool.
>libtracefs handles this perfectly fine:
>
>  https://git.kernel.org/pub/scm/libs/libtrace/libtracefs.git/tree/src/tracefs-
>utils.c#n343
>
>Looks like you are trying to change the kernel to fix a user space bug :-/
>
> NAK!
>
>-- Steve

Thanks,
Shiju
Steven Rostedt Jan. 6, 2025, 10:11 p.m. UTC | #3
On Mon, 6 Jan 2025 18:15:36 +0000
Shiju Jose <shiju.jose@huawei.com> wrote:

> >You see, it requires multiple reads to pull in an entire kernel pseudo file. None of
> >those reads are greater than PAGE_SIZE. Why should trace format files be any
> >different?  
> Thanks for the reply.
> Yes. I had a fix/workaround in the userspace rasdaemon with multiple reads like above
> as reported previously in the  following thread.  
> https://lore.kernel.org/lkml/3c9808a694d242cab35bab67602edebf@huawei.com/
> However thought a solution in the common kernel code for the format file
> may be better.  I will go ahead with the user space solution.

That would make it different than every other pseudo file in the kernel. I
rather not do that.

>  
> Also shared an information in the above thread about libtraceevent  __parse_event() does not
> return error when parse_format() fail with incomplete formt data, which resulted initialization for the
> trace event does not fail in the user space tool.

Hmm, can you send me the the format file that failed?

Just the output of "cat /sys/kernel/tracing/events/<system>/<event>/format"
will do.

-- Steve
Shiju Jose Jan. 7, 2025, 11:04 a.m. UTC | #4
>-----Original Message-----
>From: Steven Rostedt <rostedt@goodmis.org>
>Sent: 06 January 2025 22:12
>To: Shiju Jose <shiju.jose@huawei.com>
>Cc: mhiramat@kernel.org; mathieu.desnoyers@efficios.com; linux-trace-
>kernel@vger.kernel.org; linux-kernel@vger.kernel.org; Linuxarm
><linuxarm@huawei.com>; Jonathan Cameron
><jonathan.cameron@huawei.com>; tanxiaofei <tanxiaofei@huawei.com>;
>Zengtao (B) <prime.zeng@hisilicon.com>
>Subject: Re: [PATCH 1/1] tracing: Support reading trace event format file larger
>than PAGE_SIZE
>
>On Mon, 6 Jan 2025 18:15:36 +0000
>Shiju Jose <shiju.jose@huawei.com> wrote:
>
>> >You see, it requires multiple reads to pull in an entire kernel
>> >pseudo file. None of those reads are greater than PAGE_SIZE. Why
>> >should trace format files be any different?
>> Thanks for the reply.
>> Yes. I had a fix/workaround in the userspace rasdaemon with multiple
>> reads like above as reported previously in the  following thread.
>> https://lore.kernel.org/lkml/3c9808a694d242cab35bab67602edebf@huawei.c
>> om/ However thought a solution in the common kernel code for the
>> format file may be better.  I will go ahead with the user space
>> solution.
>
>That would make it different than every other pseudo file in the kernel. I rather
>not do that.
>
>>
>> Also shared an information in the above thread about libtraceevent
>> __parse_event() does not return error when parse_format() fail with
>> incomplete formt data, which resulted initialization for the trace event does
>not fail in the user space tool.
>
>Hmm, can you send me the the format file that failed?
>
>Just the output of "cat /sys/kernel/tracing/events/<system>/<event>/format"
>will do.

Please find attached format file, which might have failed to parse because of the
unsupported formats in the libtraceevent you mentioned in the following thread.
https://lore.kernel.org/lkml/20241127104132.6c1729e1@gandalf.local.home/#t    

Please find the improvement in the libraceevent which I mentioned for not returning error
when parsing failed,
===========================================
diff --git a/src/event-parse.c b/src/event-parse.c index 0427061..b9264cb 100644
--- a/src/event-parse.c
+++ b/src/event-parse.c
@@ -7905,9 +7905,14 @@ __parse_event(struct tep_handle *tep,
              const char *buf, unsigned long size,
              const char *sys)
 {
-       int ret = parse_format(eventp, tep, buf, size, sys);
-       struct tep_event *event = *eventp;
+       int ret;
+       struct tep_event *event;
+
+       ret = parse_format(eventp, tep, buf, size, sys);
+       if (ret)
+               return ret;
 
+       event = *eventp;
        if (event == NULL)
                return ret; 
 =========================================  
>
>-- Steve

Thanks,
Shiju
Steven Rostedt Jan. 7, 2025, 11:02 p.m. UTC | #5
On Tue, 7 Jan 2025 11:04:57 +0000
Shiju Jose <shiju.jose@huawei.com> wrote:

> Please find attached format file, which might have failed to parse because of the
> unsupported formats in the libtraceevent you mentioned in the following thread.
> https://lore.kernel.org/lkml/20241127104132.6c1729e1@gandalf.local.home/#t    
> 
> Please find the improvement in the libraceevent which I mentioned for not returning error
> when parsing failed,
> ===========================================
> diff --git a/src/event-parse.c b/src/event-parse.c index 0427061..b9264cb 100644
> --- a/src/event-parse.c
> +++ b/src/event-parse.c
> @@ -7905,9 +7905,14 @@ __parse_event(struct tep_handle *tep,
>               const char *buf, unsigned long size,
>               const char *sys)
>  {
> -       int ret = parse_format(eventp, tep, buf, size, sys);
> -       struct tep_event *event = *eventp;
> +       int ret;
> +       struct tep_event *event;
> +
> +       ret = parse_format(eventp, tep, buf, size, sys);
> +       if (ret)
> +               return ret;
>  
> +       event = *eventp;
>         if (event == NULL)
>                 return ret; 

Actually, this is the proper patch:

diff --git a/src/event-parse.c b/src/event-parse.c
index 33ed7fb47fff..f2e50b0e8992 100644
--- a/src/event-parse.c
+++ b/src/event-parse.c
@@ -7841,7 +7841,7 @@ static enum tep_errno parse_format(struct tep_event **eventp,
 	ret = event_read_format(event);
 	if (ret < 0) {
 		ret = TEP_ERRNO__READ_FORMAT_FAILED;
-		goto event_parse_failed;
+		goto event_alloc_failed;
 	}
 
 	/*


As it's OK if it returns that it failed to parse the print format, as it
can still read the event. But if it fails to read the fields, then it is
basically useless.

-- Steve
Steven Rostedt Jan. 7, 2025, 11:19 p.m. UTC | #6
On Tue, 7 Jan 2025 18:02:24 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> diff --git a/src/event-parse.c b/src/event-parse.c
> index 33ed7fb47fff..f2e50b0e8992 100644
> --- a/src/event-parse.c
> +++ b/src/event-parse.c
> @@ -7841,7 +7841,7 @@ static enum tep_errno parse_format(struct tep_event **eventp,
>  	ret = event_read_format(event);
>  	if (ret < 0) {
>  		ret = TEP_ERRNO__READ_FORMAT_FAILED;
> -		goto event_parse_failed;
> +		goto event_alloc_failed;
>  	}
>  
>  	/*
> 
> 
> As it's OK if it returns that it failed to parse the print format, as it
> can still read the event. But if it fails to read the fields, then it is
> basically useless.

And the event you posted would still succeed. And it should succeed
actually. It just failed to parse the print format. There's a lot of events
that do as the print format is pretty much full C code, and libtraceveent
is not a full C parser. It can also call functions that it doesn't know
about which means even if it were a full C parser it would still fail to
parse.

But it does flag the event as "TEP_EVENT_FL_FAILED", and should display
"[FAILED_TO_PARSE]" when printing the generic output.

I still think failing to parse the fields is more of a critical error
though. That shouldn't happen, and if it does, it should be corrected.

-- Steve
Shiju Jose Jan. 8, 2025, 10:19 a.m. UTC | #7
>-----Original Message-----
>From: Steven Rostedt <rostedt@goodmis.org>
>Sent: 07 January 2025 23:02
>To: Shiju Jose <shiju.jose@huawei.com>
>Cc: mhiramat@kernel.org; mathieu.desnoyers@efficios.com; linux-trace-
>kernel@vger.kernel.org; linux-kernel@vger.kernel.org; Linuxarm
><linuxarm@huawei.com>; Jonathan Cameron
><jonathan.cameron@huawei.com>; tanxiaofei <tanxiaofei@huawei.com>;
>Zengtao (B) <prime.zeng@hisilicon.com>
>Subject: Re: [PATCH 1/1] tracing: Support reading trace event format file larger
>than PAGE_SIZE
>
>On Tue, 7 Jan 2025 11:04:57 +0000
>Shiju Jose <shiju.jose@huawei.com> wrote:
>
>> Please find attached format file, which might have failed to parse
>> because of the unsupported formats in the libtraceevent you mentioned in the
>following thread.
>>
>https://lore.kernel.org/lkml/20241127104132.6c1729e1@gandalf.local.home/#t
>>
>> Please find the improvement in the libraceevent which I mentioned for
>> not returning error when parsing failed,
>> ===========================================
>> diff --git a/src/event-parse.c b/src/event-parse.c index
>> 0427061..b9264cb 100644
>> --- a/src/event-parse.c
>> +++ b/src/event-parse.c
>> @@ -7905,9 +7905,14 @@ __parse_event(struct tep_handle *tep,
>>               const char *buf, unsigned long size,
>>               const char *sys)
>>  {
>> -       int ret = parse_format(eventp, tep, buf, size, sys);
>> -       struct tep_event *event = *eventp;
>> +       int ret;
>> +       struct tep_event *event;
>> +
>> +       ret = parse_format(eventp, tep, buf, size, sys);
>> +       if (ret)
>> +               return ret;
>>
>> +       event = *eventp;
>>         if (event == NULL)
>>                 return ret;
>
>Actually, this is the proper patch:
>
>diff --git a/src/event-parse.c b/src/event-parse.c index
>33ed7fb47fff..f2e50b0e8992 100644
>--- a/src/event-parse.c
>+++ b/src/event-parse.c
>@@ -7841,7 +7841,7 @@ static enum tep_errno parse_format(struct tep_event
>**eventp,
> 	ret = event_read_format(event);
> 	if (ret < 0) {
> 		ret = TEP_ERRNO__READ_FORMAT_FAILED;
>-		goto event_parse_failed;
>+		goto event_alloc_failed;
> 	}
>
> 	/*
>
>
>As it's OK if it returns that it failed to parse the print format, as it can still read
>the event. But if it fails to read the fields, then it is basically useless. 
Thanks.
Your patch worked fine for returning error to the user space tool for the incomplete format data case as
the following check become true in the __parse_event()  after freeing 'event' in the goto event_alloc_failed.
 if (event == NULL)
     return ret;
>
>-- Steve

Thanks,
Shiju
Shiju Jose Jan. 8, 2025, 10:34 a.m. UTC | #8
>-----Original Message-----
>From: Steven Rostedt <rostedt@goodmis.org>
>Sent: 07 January 2025 23:19
>To: Shiju Jose <shiju.jose@huawei.com>
>Cc: mhiramat@kernel.org; mathieu.desnoyers@efficios.com; linux-trace-
>kernel@vger.kernel.org; linux-kernel@vger.kernel.org; Linuxarm
><linuxarm@huawei.com>; Jonathan Cameron
><jonathan.cameron@huawei.com>; tanxiaofei <tanxiaofei@huawei.com>;
>Zengtao (B) <prime.zeng@hisilicon.com>
>Subject: Re: [PATCH 1/1] tracing: Support reading trace event format file larger
>than PAGE_SIZE
>
>On Tue, 7 Jan 2025 18:02:24 -0500
>Steven Rostedt <rostedt@goodmis.org> wrote:
>
>> diff --git a/src/event-parse.c b/src/event-parse.c index
>> 33ed7fb47fff..f2e50b0e8992 100644
>> --- a/src/event-parse.c
>> +++ b/src/event-parse.c
>> @@ -7841,7 +7841,7 @@ static enum tep_errno parse_format(struct
>tep_event **eventp,
>>  	ret = event_read_format(event);
>>  	if (ret < 0) {
>>  		ret = TEP_ERRNO__READ_FORMAT_FAILED;
>> -		goto event_parse_failed;
>> +		goto event_alloc_failed;
>>  	}
>>
>>  	/*
>>
>>
>> As it's OK if it returns that it failed to parse the print format, as
>> it can still read the event. But if it fails to read the fields, then
>> it is basically useless.
>
>And the event you posted would still succeed. And it should succeed actually. It
>just failed to parse the print format. There's a lot of events that do as the print
>format is pretty much full C code, and libtraceveent is not a full C parser. It can
>also call functions that it doesn't know about which means even if it were a full C
>parser it would still fail to parse.
>
>But it does flag the event as "TEP_EVENT_FL_FAILED", and should display
>"[FAILED_TO_PARSE]" when printing the generic output.
>
Ok.
>I still think failing to parse the fields is more of a critical error though. That
>shouldn't happen, and if it does, it should be corrected.

As reported previously, parsing fields succeed in the rasdaemon for CXL events, when format file is larger than
PAGE_SIZE, with the rasdaemon change for reading complete format file.

>
>-- Steve

Thanks,
Shiju
diff mbox series

Patch

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 1545cc8b49d0..ef33614e7f22 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -1801,6 +1801,36 @@  static int trace_format_open(struct inode *inode, struct file *file)
 	return 0;
 }
 
+/**
+ *      trace_format_read - read() method for format file.
+ *      @file: the file to read from
+ *      @buf: the buffer to read to
+ *      @size: the maximum number of bytes to read
+ *      @ppos: the current position in the file
+ *
+ *  * Return:
+ *  * %0  - No bytes copied (EOF).
+ *  * %>0 - Number of bytes copied.
+ *  * %<0 - Error code.
+ */
+static ssize_t trace_format_read(struct file *file, char __user *buf, size_t size, loff_t *ppos)
+{
+	ssize_t copied = 0;
+	ssize_t ret;
+
+	do {
+		ret = seq_read(file, buf + copied, size - copied, ppos);
+		if (ret < 0)
+			return ret;
+
+		copied += ret;
+		if (copied >= size)
+			break;
+	} while (ret);
+
+	return copied;
+}
+
 #ifdef CONFIG_PERF_EVENTS
 static ssize_t
 event_id_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos)
@@ -2284,7 +2314,7 @@  static const struct file_operations ftrace_enable_fops = {
 
 static const struct file_operations ftrace_event_format_fops = {
 	.open = trace_format_open,
-	.read = seq_read,
+	.read = trace_format_read,
 	.llseek = seq_lseek,
 	.release = seq_release,
 };