Message ID | 20211120112738.45980-8-laoar.shao@gmail.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | BPF |
Headers | show |
Series | task comm cleanups | expand |
Context | Check | Description |
---|---|---|
bpf/vmtest-bpf-next | pending | VM_Test |
bpf/vmtest-bpf-next-PR | pending | PR summary |
bpf/vmtest-bpf | fail | VM_Test |
bpf/vmtest-bpf-PR | fail | PR summary |
netdev/tree_selection | success | Not a local patch, async |
Hi, Yafang Shao <laoar.shao@gmail.com> writes: > As the sched:sched_switch tracepoint args are derived from the kernel, > we'd better make it same with the kernel. So the macro TASK_COMM_LEN is > converted to type enum, then all the BPF programs can get it through BTF. > > The BPF program which wants to use TASK_COMM_LEN should include the header > vmlinux.h. Regarding the test_stacktrace_map and test_tracepoint, as the > type defined in linux/bpf.h are also defined in vmlinux.h, so we don't > need to include linux/bpf.h again. > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com> > Acked-by: Andrii Nakryiko <andrii@kernel.org> > Acked-by: David Hildenbrand <david@redhat.com> > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com> > Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> > Cc: Michal Miroslaw <mirq-linux@rere.qmqm.pl> > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Steven Rostedt <rostedt@goodmis.org> > Cc: Matthew Wilcox <willy@infradead.org> > Cc: David Hildenbrand <david@redhat.com> > Cc: Al Viro <viro@zeniv.linux.org.uk> > Cc: Kees Cook <keescook@chromium.org> > Cc: Petr Mladek <pmladek@suse.com> > --- > include/linux/sched.h | 9 +++++++-- > tools/testing/selftests/bpf/progs/test_stacktrace_map.c | 6 +++--- > tools/testing/selftests/bpf/progs/test_tracepoint.c | 6 +++--- > 3 files changed, 13 insertions(+), 8 deletions(-) > > diff --git a/include/linux/sched.h b/include/linux/sched.h > index 78c351e35fec..cecd4806edc6 100644 > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -274,8 +274,13 @@ struct task_group; > > #define get_current_state() READ_ONCE(current->__state) > > -/* Task command name length: */ > -#define TASK_COMM_LEN 16 > +/* > + * Define the task command name length as enum, then it can be visible to > + * BPF programs. > + */ > +enum { > + TASK_COMM_LEN = 16, > +}; This breaks the trigger-field-variable-support.tc from the ftrace test suite at least on s390: echo 'hist:keys=next_comm:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,sched.sched_waking.prio,next_comm) if next_comm=="ping"' linux/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-field-variable-support.tc: line 15: echo: write error: Invalid argument I added a debugging line into check_synth_field(): [ 44.091037] field->size 16, hist_field->size 16, field->is_signed 1, hist_field->is_signed 0 Note the difference in the signed field. Regards Sven
On Mon, Nov 29, 2021 at 6:13 PM Sven Schnelle <svens@linux.ibm.com> wrote: > > Hi, > > Yafang Shao <laoar.shao@gmail.com> writes: > > > As the sched:sched_switch tracepoint args are derived from the kernel, > > we'd better make it same with the kernel. So the macro TASK_COMM_LEN is > > converted to type enum, then all the BPF programs can get it through BTF. > > > > The BPF program which wants to use TASK_COMM_LEN should include the header > > vmlinux.h. Regarding the test_stacktrace_map and test_tracepoint, as the > > type defined in linux/bpf.h are also defined in vmlinux.h, so we don't > > need to include linux/bpf.h again. > > > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com> > > Acked-by: Andrii Nakryiko <andrii@kernel.org> > > Acked-by: David Hildenbrand <david@redhat.com> > > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > > Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com> > > Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> > > Cc: Michal Miroslaw <mirq-linux@rere.qmqm.pl> > > Cc: Peter Zijlstra <peterz@infradead.org> > > Cc: Steven Rostedt <rostedt@goodmis.org> > > Cc: Matthew Wilcox <willy@infradead.org> > > Cc: David Hildenbrand <david@redhat.com> > > Cc: Al Viro <viro@zeniv.linux.org.uk> > > Cc: Kees Cook <keescook@chromium.org> > > Cc: Petr Mladek <pmladek@suse.com> > > --- > > include/linux/sched.h | 9 +++++++-- > > tools/testing/selftests/bpf/progs/test_stacktrace_map.c | 6 +++--- > > tools/testing/selftests/bpf/progs/test_tracepoint.c | 6 +++--- > > 3 files changed, 13 insertions(+), 8 deletions(-) > > > > diff --git a/include/linux/sched.h b/include/linux/sched.h > > index 78c351e35fec..cecd4806edc6 100644 > > --- a/include/linux/sched.h > > +++ b/include/linux/sched.h > > @@ -274,8 +274,13 @@ struct task_group; > > > > #define get_current_state() READ_ONCE(current->__state) > > > > -/* Task command name length: */ > > -#define TASK_COMM_LEN 16 > > +/* > > + * Define the task command name length as enum, then it can be visible to > > + * BPF programs. > > + */ > > +enum { > > + TASK_COMM_LEN = 16, > > +}; > > This breaks the trigger-field-variable-support.tc from the ftrace test > suite at least on s390: > > echo 'hist:keys=next_comm:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,sched.sched_waking.prio,next_comm) if next_comm=="ping"' > linux/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-field-variable-support.tc: line 15: echo: write error: Invalid argument > > I added a debugging line into check_synth_field(): > > [ 44.091037] field->size 16, hist_field->size 16, field->is_signed 1, hist_field->is_signed 0 > > Note the difference in the signed field. > Hi Sven, Thanks for the report and debugging! Seems we should explicitly define it as signed ? Could you pls. help verify it? diff --git a/include/linux/sched.h b/include/linux/sched.h index cecd4806edc6..44d36c6af3e1 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -278,7 +278,7 @@ struct task_group; * Define the task command name length as enum, then it can be visible to * BPF programs. */ -enum { +enum SignedEnum { TASK_COMM_LEN = 16, };
Hi, Yafang Shao <laoar.shao@gmail.com> writes: > On Mon, Nov 29, 2021 at 6:13 PM Sven Schnelle <svens@linux.ibm.com> wrote: >> > diff --git a/include/linux/sched.h b/include/linux/sched.h >> > index 78c351e35fec..cecd4806edc6 100644 >> > --- a/include/linux/sched.h >> > +++ b/include/linux/sched.h >> > @@ -274,8 +274,13 @@ struct task_group; >> > >> > #define get_current_state() READ_ONCE(current->__state) >> > >> > -/* Task command name length: */ >> > -#define TASK_COMM_LEN 16 >> > +/* >> > + * Define the task command name length as enum, then it can be visible to >> > + * BPF programs. >> > + */ >> > +enum { >> > + TASK_COMM_LEN = 16, >> > +}; >> >> This breaks the trigger-field-variable-support.tc from the ftrace test >> suite at least on s390: >> >> echo >> 'hist:keys=next_comm:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,sched.sched_waking.prio,next_comm) >> if next_comm=="ping"' >> linux/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-field-variable-support.tc: line 15: echo: write error: Invalid argument >> >> I added a debugging line into check_synth_field(): >> >> [ 44.091037] field->size 16, hist_field->size 16, field->is_signed 1, hist_field->is_signed 0 >> >> Note the difference in the signed field. >> > > Hi Sven, > > Thanks for the report and debugging! > Seems we should explicitly define it as signed ? > Could you pls. help verify it? > > diff --git a/include/linux/sched.h b/include/linux/sched.h > index cecd4806edc6..44d36c6af3e1 100644 > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -278,7 +278,7 @@ struct task_group; > * Define the task command name length as enum, then it can be visible to > * BPF programs. > */ > -enum { > +enum SignedEnum { > TASK_COMM_LEN = 16, > }; Umm no. What you're doing here is to define the name of the enum as 'SignedEnum'. This doesn't change the type. I think before C++0x you couldn't force an enum type. Regards Sven
On 29.11.21 15:21, Sven Schnelle wrote: > Hi, > > Yafang Shao <laoar.shao@gmail.com> writes: > >> On Mon, Nov 29, 2021 at 6:13 PM Sven Schnelle <svens@linux.ibm.com> wrote: >>>> diff --git a/include/linux/sched.h b/include/linux/sched.h >>>> index 78c351e35fec..cecd4806edc6 100644 >>>> --- a/include/linux/sched.h >>>> +++ b/include/linux/sched.h >>>> @@ -274,8 +274,13 @@ struct task_group; >>>> >>>> #define get_current_state() READ_ONCE(current->__state) >>>> >>>> -/* Task command name length: */ >>>> -#define TASK_COMM_LEN 16 >>>> +/* >>>> + * Define the task command name length as enum, then it can be visible to >>>> + * BPF programs. >>>> + */ >>>> +enum { >>>> + TASK_COMM_LEN = 16, >>>> +}; >>> >>> This breaks the trigger-field-variable-support.tc from the ftrace test >>> suite at least on s390: >>> >>> echo >>> 'hist:keys=next_comm:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,sched.sched_waking.prio,next_comm) >>> if next_comm=="ping"' >>> linux/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-field-variable-support.tc: line 15: echo: write error: Invalid argument >>> >>> I added a debugging line into check_synth_field(): >>> >>> [ 44.091037] field->size 16, hist_field->size 16, field->is_signed 1, hist_field->is_signed 0 >>> >>> Note the difference in the signed field. >>> >> >> Hi Sven, >> >> Thanks for the report and debugging! >> Seems we should explicitly define it as signed ? >> Could you pls. help verify it? >> >> diff --git a/include/linux/sched.h b/include/linux/sched.h >> index cecd4806edc6..44d36c6af3e1 100644 >> --- a/include/linux/sched.h >> +++ b/include/linux/sched.h >> @@ -278,7 +278,7 @@ struct task_group; >> * Define the task command name length as enum, then it can be visible to >> * BPF programs. >> */ >> -enum { >> +enum SignedEnum { >> TASK_COMM_LEN = 16, >> }; > > Umm no. What you're doing here is to define the name of the enum as > 'SignedEnum'. This doesn't change the type. I think before C++0x you > couldn't force an enum type. I think there are only some "hacks" to modify the type with GCC. For example, with "__attribute__((packed))" we can instruct GCC to use the smallest type possible for the defined enum values. I think with some fake entries one can eventually instruct GCC to use an unsigned type in some cases: https://stackoverflow.com/questions/14635833/is-there-a-way-to-make-an-enum-unsigned-in-the-c90-standard-misra-c-2004-compl enum { TASK_COMM_LEN = 16, TASK_FORCE_UNSIGNED = 0x80000000, }; Haven't tested it, though, and I'm not sure if we should really do that ... :)
Hi, David Hildenbrand <david@redhat.com> writes: > On 29.11.21 15:21, Sven Schnelle wrote: >> Yafang Shao <laoar.shao@gmail.com> writes: >>> Thanks for the report and debugging! >>> Seems we should explicitly define it as signed ? >>> Could you pls. help verify it? >>> >>> diff --git a/include/linux/sched.h b/include/linux/sched.h >>> index cecd4806edc6..44d36c6af3e1 100644 >>> --- a/include/linux/sched.h >>> +++ b/include/linux/sched.h >>> @@ -278,7 +278,7 @@ struct task_group; >>> * Define the task command name length as enum, then it can be visible to >>> * BPF programs. >>> */ >>> -enum { >>> +enum SignedEnum { >>> TASK_COMM_LEN = 16, >>> }; >> >> Umm no. What you're doing here is to define the name of the enum as >> 'SignedEnum'. This doesn't change the type. I think before C++0x you >> couldn't force an enum type. > > I think there are only some "hacks" to modify the type with GCC. For > example, with "__attribute__((packed))" we can instruct GCC to use the > smallest type possible for the defined enum values. Yes, i meant no way that the standard defines. You could force it to signed by having a negative member. > I think with some fake entries one can eventually instruct GCC to use an > unsigned type in some cases: > > https://stackoverflow.com/questions/14635833/is-there-a-way-to-make-an-enum-unsigned-in-the-c90-standard-misra-c-2004-compl > > enum { > TASK_COMM_LEN = 16, > TASK_FORCE_UNSIGNED = 0x80000000, > }; > > Haven't tested it, though, and I'm not sure if we should really do that > ... :) TBH, i would vote for reverting the change. defining an array size as enum feels really odd. Regards Sven
On Mon, Nov 29, 2021 at 10:21 PM Sven Schnelle <svens@linux.ibm.com> wrote: > > Hi, > > Yafang Shao <laoar.shao@gmail.com> writes: > > > On Mon, Nov 29, 2021 at 6:13 PM Sven Schnelle <svens@linux.ibm.com> wrote: > >> > diff --git a/include/linux/sched.h b/include/linux/sched.h > >> > index 78c351e35fec..cecd4806edc6 100644 > >> > --- a/include/linux/sched.h > >> > +++ b/include/linux/sched.h > >> > @@ -274,8 +274,13 @@ struct task_group; > >> > > >> > #define get_current_state() READ_ONCE(current->__state) > >> > > >> > -/* Task command name length: */ > >> > -#define TASK_COMM_LEN 16 > >> > +/* > >> > + * Define the task command name length as enum, then it can be visible to > >> > + * BPF programs. > >> > + */ > >> > +enum { > >> > + TASK_COMM_LEN = 16, > >> > +}; > >> > >> This breaks the trigger-field-variable-support.tc from the ftrace test > >> suite at least on s390: > >> > >> echo > >> 'hist:keys=next_comm:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,sched.sched_waking.prio,next_comm) > >> if next_comm=="ping"' > >> linux/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-field-variable-support.tc: line 15: echo: write error: Invalid argument > >> > >> I added a debugging line into check_synth_field(): > >> > >> [ 44.091037] field->size 16, hist_field->size 16, field->is_signed 1, hist_field->is_signed 0 > >> > >> Note the difference in the signed field. > >> > > > > Hi Sven, > > > > Thanks for the report and debugging! > > Seems we should explicitly define it as signed ? > > Could you pls. help verify it? > > > > diff --git a/include/linux/sched.h b/include/linux/sched.h > > index cecd4806edc6..44d36c6af3e1 100644 > > --- a/include/linux/sched.h > > +++ b/include/linux/sched.h > > @@ -278,7 +278,7 @@ struct task_group; > > * Define the task command name length as enum, then it can be visible to > > * BPF programs. > > */ > > -enum { > > +enum SignedEnum { > > TASK_COMM_LEN = 16, > > }; > > Umm no. What you're doing here is to define the name of the enum as > 'SignedEnum'. This doesn't change the type. I think before C++0x you > couldn't force an enum type. > Ah, I made a stupid mistake....
On Mon, Nov 29, 2021 at 10:38 PM Sven Schnelle <svens@linux.ibm.com> wrote: > > Hi, > > David Hildenbrand <david@redhat.com> writes: > > On 29.11.21 15:21, Sven Schnelle wrote: > >> Yafang Shao <laoar.shao@gmail.com> writes: > >>> Thanks for the report and debugging! > >>> Seems we should explicitly define it as signed ? > >>> Could you pls. help verify it? > >>> > >>> diff --git a/include/linux/sched.h b/include/linux/sched.h > >>> index cecd4806edc6..44d36c6af3e1 100644 > >>> --- a/include/linux/sched.h > >>> +++ b/include/linux/sched.h > >>> @@ -278,7 +278,7 @@ struct task_group; > >>> * Define the task command name length as enum, then it can be visible to > >>> * BPF programs. > >>> */ > >>> -enum { > >>> +enum SignedEnum { > >>> TASK_COMM_LEN = 16, > >>> }; > >> > >> Umm no. What you're doing here is to define the name of the enum as > >> 'SignedEnum'. This doesn't change the type. I think before C++0x you > >> couldn't force an enum type. > > > > I think there are only some "hacks" to modify the type with GCC. For > > example, with "__attribute__((packed))" we can instruct GCC to use the > > smallest type possible for the defined enum values. > > Yes, i meant no way that the standard defines. You could force it to > signed by having a negative member. > > > I think with some fake entries one can eventually instruct GCC to use an > > unsigned type in some cases: > > > > https://stackoverflow.com/questions/14635833/is-there-a-way-to-make-an-enum-unsigned-in-the-c90-standard-misra-c-2004-compl > > > > enum { > > TASK_COMM_LEN = 16, > > TASK_FORCE_UNSIGNED = 0x80000000, > > }; > > > > Haven't tested it, though, and I'm not sure if we should really do that > > ... :) > > TBH, i would vote for reverting the change. defining an array size as > enum feels really odd. > We changed it to enum because the BTF can't parse macro while it can parse the enum type. Anyway I don't insist on keeping this change if you think reverting it is better. Andrew, would you pls. help drop this patch from the -mm tree (the other 6 patches in this series can be kept) ? -- Thanks Yafang
On Mon, 29 Nov 2021 23:33:33 +0800 Yafang Shao <laoar.shao@gmail.com> wrote: > > TBH, i would vote for reverting the change. defining an array size as > > enum feels really odd. > > > > We changed it to enum because the BTF can't parse macro while it can > parse the enum type. I wonder if BTF could take advantage of the tracing: TRACE_DEFINE_ENUM() macros? This is how they are converted for user space tooling. Anyway, I'd have to go and look at why that trigger test failed. I don't see how the size of the array caused it to change the signage of value. -- Steve > Anyway I don't insist on keeping this change if you think reverting it > is better.
On Mon, 29 Nov 2021 11:07:55 -0500 Steven Rostedt <rostedt@goodmis.org> wrote: > I wonder if BTF could take advantage of the tracing: > > TRACE_DEFINE_ENUM() macros? Bah' BTF does handle enums, it doesn't handle macros. But I wonder if we could do something similar for BTF. That is, force it. -- Steve
On Mon, 29 Nov 2021 11:13:31 +0100 Sven Schnelle <svens@linux.ibm.com> wrote: > This breaks the trigger-field-variable-support.tc from the ftrace test > suite at least on s390: > > echo 'hist:keys=next_comm:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,sched.sched_waking.prio,next_comm) if next_comm=="ping"' > linux/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-field-variable-support.tc: line 15: echo: write error: Invalid argument > > I added a debugging line into check_synth_field(): > > [ 44.091037] field->size 16, hist_field->size 16, field->is_signed 1, hist_field->is_signed 0 > > Note the difference in the signed field. That should not break on strings. Does this fix it (if you keep the patch)? -- Steve diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c index 9555b8e1d1e3..319f9c8ca7e7 100644 --- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -3757,7 +3757,7 @@ static int check_synth_field(struct synth_event *event, if (strcmp(field->type, hist_field->type) != 0) { if (field->size != hist_field->size || - field->is_signed != hist_field->is_signed) + (!field->is_string && field->is_signed != hist_field->is_signed)) return -EINVAL; }
Steven Rostedt <rostedt@goodmis.org> writes: > On Mon, 29 Nov 2021 11:13:31 +0100 > Sven Schnelle <svens@linux.ibm.com> wrote: > > >> This breaks the trigger-field-variable-support.tc from the ftrace test >> suite at least on s390: >> >> echo >> 'hist:keys=next_comm:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,sched.sched_waking.prio,next_comm) >> if next_comm=="ping"' >> linux/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-field-variable-support.tc: line 15: echo: write error: Invalid argument >> >> I added a debugging line into check_synth_field(): >> >> [ 44.091037] field->size 16, hist_field->size 16, field->is_signed 1, hist_field->is_signed 0 >> >> Note the difference in the signed field. > > That should not break on strings. > > Does this fix it (if you keep the patch)? It does. Thanks! > -- Steve > > diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c > index 9555b8e1d1e3..319f9c8ca7e7 100644 > --- a/kernel/trace/trace_events_hist.c > +++ b/kernel/trace/trace_events_hist.c > @@ -3757,7 +3757,7 @@ static int check_synth_field(struct synth_event *event, > > if (strcmp(field->type, hist_field->type) != 0) { > if (field->size != hist_field->size || > - field->is_signed != hist_field->is_signed) > + (!field->is_string && field->is_signed != hist_field->is_signed)) > return -EINVAL; > } >
On Tue, Nov 30, 2021 at 1:30 AM Steven Rostedt <rostedt@goodmis.org> wrote: > > On Mon, 29 Nov 2021 11:13:31 +0100 > Sven Schnelle <svens@linux.ibm.com> wrote: > > > > This breaks the trigger-field-variable-support.tc from the ftrace test > > suite at least on s390: > > > > echo 'hist:keys=next_comm:wakeup_lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,next_pid,sched.sched_waking.prio,next_comm) if next_comm=="ping"' > > linux/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-field-variable-support.tc: line 15: echo: write error: Invalid argument > > > > I added a debugging line into check_synth_field(): > > > > [ 44.091037] field->size 16, hist_field->size 16, field->is_signed 1, hist_field->is_signed 0 > > > > Note the difference in the signed field. > > That should not break on strings. > > Does this fix it (if you keep the patch)? > > -- Steve > > diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c > index 9555b8e1d1e3..319f9c8ca7e7 100644 > --- a/kernel/trace/trace_events_hist.c > +++ b/kernel/trace/trace_events_hist.c > @@ -3757,7 +3757,7 @@ static int check_synth_field(struct synth_event *event, > > if (strcmp(field->type, hist_field->type) != 0) { > if (field->size != hist_field->size || > - field->is_signed != hist_field->is_signed) > + (!field->is_string && field->is_signed != hist_field->is_signed)) > return -EINVAL; > } > Many thanks for the quick fix! It seems this fix should be ahead of patch #7. I will send v3 which contains your fix.
On Tue, 30 Nov 2021 11:03:48 +0800 Yafang Shao <laoar.shao@gmail.com> wrote: > Many thanks for the quick fix! > It seems this fix should be ahead of patch #7. > I will send v3 which contains your fix. Don't bother. I'm actually going to send this to Linus as a bug fix. -- Steve
On Tue, Nov 30, 2021 at 10:23 PM Steven Rostedt <rostedt@goodmis.org> wrote: > > On Tue, 30 Nov 2021 11:03:48 +0800 > Yafang Shao <laoar.shao@gmail.com> wrote: > > > Many thanks for the quick fix! > > It seems this fix should be ahead of patch #7. > > I will send v3 which contains your fix. > > Don't bother. I'm actually going to send this to Linus as a bug fix. > Great! Thanks for the work.
On Sat, Nov 20, 2021 at 11:27:38AM +0000, Yafang Shao wrote: > As the sched:sched_switch tracepoint args are derived from the kernel, > we'd better make it same with the kernel. So the macro TASK_COMM_LEN is > converted to type enum, then all the BPF programs can get it through BTF. > > The BPF program which wants to use TASK_COMM_LEN should include the header > vmlinux.h. Regarding the test_stacktrace_map and test_tracepoint, as the > type defined in linux/bpf.h are also defined in vmlinux.h, so we don't > need to include linux/bpf.h again. > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com> > Acked-by: Andrii Nakryiko <andrii@kernel.org> > Acked-by: David Hildenbrand <david@redhat.com> > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com> > Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> > Cc: Michal Miroslaw <mirq-linux@rere.qmqm.pl> > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Steven Rostedt <rostedt@goodmis.org> > Cc: Matthew Wilcox <willy@infradead.org> > Cc: David Hildenbrand <david@redhat.com> > Cc: Al Viro <viro@zeniv.linux.org.uk> > Cc: Kees Cook <keescook@chromium.org> > Cc: Petr Mladek <pmladek@suse.com> > --- > include/linux/sched.h | 9 +++++++-- > tools/testing/selftests/bpf/progs/test_stacktrace_map.c | 6 +++--- > tools/testing/selftests/bpf/progs/test_tracepoint.c | 6 +++--- > 3 files changed, 13 insertions(+), 8 deletions(-) Hey all, I know this is a little late, but I recently got a report that this change was causiing older versions of perfetto to stop working. Apparently newer versions of perfetto has worked around this via the following changes: https://android.googlesource.com/platform/external/perfetto/+/c717c93131b1b6e3705a11092a70ac47c78b731d%5E%21/ https://android.googlesource.com/platform/external/perfetto/+/160a504ad5c91a227e55f84d3e5d3fe22af7c2bb%5E%21/ But for older versions of perfetto, reverting upstream commit 3087c61ed2c4 ("tools/testing/selftests/bpf: replace open-coded 16 with TASK_COMM_LEN") is necessary to get it back to working. I haven't dug very far into the details, and obviously this doesn't break with the updated perfetto, but from a high level this does seem to be a breaking-userland regression. So I wanted to reach out to see if there was more context for this breakage? I don't want to raise a unnecessary stink if this was an unfortuante but forced situation. thanks -john
On Wed, Feb 8, 2023 at 2:01 PM John Stultz <jstultz@google.com> wrote: > > On Sat, Nov 20, 2021 at 11:27:38AM +0000, Yafang Shao wrote: > > As the sched:sched_switch tracepoint args are derived from the kernel, > > we'd better make it same with the kernel. So the macro TASK_COMM_LEN is > > converted to type enum, then all the BPF programs can get it through BTF. > > > > The BPF program which wants to use TASK_COMM_LEN should include the header > > vmlinux.h. Regarding the test_stacktrace_map and test_tracepoint, as the > > type defined in linux/bpf.h are also defined in vmlinux.h, so we don't > > need to include linux/bpf.h again. > > > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com> > > Acked-by: Andrii Nakryiko <andrii@kernel.org> > > Acked-by: David Hildenbrand <david@redhat.com> > > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > > Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com> > > Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> > > Cc: Michal Miroslaw <mirq-linux@rere.qmqm.pl> > > Cc: Peter Zijlstra <peterz@infradead.org> > > Cc: Steven Rostedt <rostedt@goodmis.org> > > Cc: Matthew Wilcox <willy@infradead.org> > > Cc: David Hildenbrand <david@redhat.com> > > Cc: Al Viro <viro@zeniv.linux.org.uk> > > Cc: Kees Cook <keescook@chromium.org> > > Cc: Petr Mladek <pmladek@suse.com> > > --- > > include/linux/sched.h | 9 +++++++-- > > tools/testing/selftests/bpf/progs/test_stacktrace_map.c | 6 +++--- > > tools/testing/selftests/bpf/progs/test_tracepoint.c | 6 +++--- > > 3 files changed, 13 insertions(+), 8 deletions(-) > > Hey all, > I know this is a little late, but I recently got a report that > this change was causiing older versions of perfetto to stop > working. > > Apparently newer versions of perfetto has worked around this > via the following changes: > https://android.googlesource.com/platform/external/perfetto/+/c717c93131b1b6e3705a11092a70ac47c78b731d%5E%21/ > https://android.googlesource.com/platform/external/perfetto/+/160a504ad5c91a227e55f84d3e5d3fe22af7c2bb%5E%21/ > > But for older versions of perfetto, reverting upstream commit > 3087c61ed2c4 ("tools/testing/selftests/bpf: replace open-coded 16 > with TASK_COMM_LEN") is necessary to get it back to working. > > I haven't dug very far into the details, and obviously this doesn't > break with the updated perfetto, but from a high level this does > seem to be a breaking-userland regression. > > So I wanted to reach out to see if there was more context for this > breakage? I don't want to raise a unnecessary stink if this was > an unfortuante but forced situation. Let me understand what you're saying... The commit 3087c61ed2c4 did -/* Task command name length: */ -#define TASK_COMM_LEN 16 +/* + * Define the task command name length as enum, then it can be visible to + * BPF programs. + */ +enum { + TASK_COMM_LEN = 16, +}; and that caused: cat /sys/kernel/debug/tracing/events/task/task_newtask/format to print field:char comm[TASK_COMM_LEN]; offset:12; size:16; signed:0; instead of field:char comm[16]; offset:12; size:16; signed:0; so the ftrace parsing android tracing tool had to do: - if (Match(type_and_name.c_str(), R"(char [a-zA-Z_]+\[[0-9]+\])")) { + if (Match(type_and_name.c_str(), + R"(char [a-zA-Z_][a-zA-Z_0-9]*\[[a-zA-Z_0-9]+\])")) { to workaround this change. Right? And what are you proposing?
On Wed, Feb 8, 2023 at 4:11 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Wed, Feb 8, 2023 at 2:01 PM John Stultz <jstultz@google.com> wrote: > > > > On Sat, Nov 20, 2021 at 11:27:38AM +0000, Yafang Shao wrote: > > > As the sched:sched_switch tracepoint args are derived from the kernel, > > > we'd better make it same with the kernel. So the macro TASK_COMM_LEN is > > > converted to type enum, then all the BPF programs can get it through BTF. > > > > > > The BPF program which wants to use TASK_COMM_LEN should include the header > > > vmlinux.h. Regarding the test_stacktrace_map and test_tracepoint, as the > > > type defined in linux/bpf.h are also defined in vmlinux.h, so we don't > > > need to include linux/bpf.h again. > > > > > > Signed-off-by: Yafang Shao <laoar.shao@gmail.com> > > > Acked-by: Andrii Nakryiko <andrii@kernel.org> > > > Acked-by: David Hildenbrand <david@redhat.com> > > > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > > > Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com> > > > Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> > > > Cc: Michal Miroslaw <mirq-linux@rere.qmqm.pl> > > > Cc: Peter Zijlstra <peterz@infradead.org> > > > Cc: Steven Rostedt <rostedt@goodmis.org> > > > Cc: Matthew Wilcox <willy@infradead.org> > > > Cc: David Hildenbrand <david@redhat.com> > > > Cc: Al Viro <viro@zeniv.linux.org.uk> > > > Cc: Kees Cook <keescook@chromium.org> > > > Cc: Petr Mladek <pmladek@suse.com> > > > --- > > > include/linux/sched.h | 9 +++++++-- > > > tools/testing/selftests/bpf/progs/test_stacktrace_map.c | 6 +++--- > > > tools/testing/selftests/bpf/progs/test_tracepoint.c | 6 +++--- > > > 3 files changed, 13 insertions(+), 8 deletions(-) > > > > Hey all, > > I know this is a little late, but I recently got a report that > > this change was causiing older versions of perfetto to stop > > working. > > > > Apparently newer versions of perfetto has worked around this > > via the following changes: > > https://android.googlesource.com/platform/external/perfetto/+/c717c93131b1b6e3705a11092a70ac47c78b731d%5E%21/ > > https://android.googlesource.com/platform/external/perfetto/+/160a504ad5c91a227e55f84d3e5d3fe22af7c2bb%5E%21/ > > > > But for older versions of perfetto, reverting upstream commit > > 3087c61ed2c4 ("tools/testing/selftests/bpf: replace open-coded 16 > > with TASK_COMM_LEN") is necessary to get it back to working. > > > > I haven't dug very far into the details, and obviously this doesn't > > break with the updated perfetto, but from a high level this does > > seem to be a breaking-userland regression. > > > > So I wanted to reach out to see if there was more context for this > > breakage? I don't want to raise a unnecessary stink if this was > > an unfortuante but forced situation. > > Let me understand what you're saying... > > The commit 3087c61ed2c4 did > > -/* Task command name length: */ > -#define TASK_COMM_LEN 16 > +/* > + * Define the task command name length as enum, then it can be visible to > + * BPF programs. > + */ > +enum { > + TASK_COMM_LEN = 16, > +}; > > > and that caused: > > cat /sys/kernel/debug/tracing/events/task/task_newtask/format > > to print > field:char comm[TASK_COMM_LEN]; offset:12; size:16; signed:0; > instead of > field:char comm[16]; offset:12; size:16; signed:0; > > so the ftrace parsing android tracing tool had to do: > > - if (Match(type_and_name.c_str(), R"(char [a-zA-Z_]+\[[0-9]+\])")) { > + if (Match(type_and_name.c_str(), > + R"(char [a-zA-Z_][a-zA-Z_0-9]*\[[a-zA-Z_0-9]+\])")) { > > to workaround this change. > Right? I believe so. > And what are you proposing? I'm not proposing anything. I was just wanting to understand more context around this, as it outwardly appears to be a user-breaking change, and that is usually not done, so I figured it was an issue worth raising. If the debug/tracing/*/format output is in the murky not-really-abi space, that's fine, but I wanted to know if this was understood as something that may require userland updates or if this was a unexpected side-effect. thanks -john
On 2023-02-08 19:54, John Stultz wrote: > On Wed, Feb 8, 2023 at 4:11 PM Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: >> >> On Wed, Feb 8, 2023 at 2:01 PM John Stultz <jstultz@google.com> wrote: >>> >>> On Sat, Nov 20, 2021 at 11:27:38AM +0000, Yafang Shao wrote: >>>> As the sched:sched_switch tracepoint args are derived from the kernel, >>>> we'd better make it same with the kernel. So the macro TASK_COMM_LEN is >>>> converted to type enum, then all the BPF programs can get it through BTF. >>>> >>>> The BPF program which wants to use TASK_COMM_LEN should include the header >>>> vmlinux.h. Regarding the test_stacktrace_map and test_tracepoint, as the >>>> type defined in linux/bpf.h are also defined in vmlinux.h, so we don't >>>> need to include linux/bpf.h again. >>>> >>>> Signed-off-by: Yafang Shao <laoar.shao@gmail.com> >>>> Acked-by: Andrii Nakryiko <andrii@kernel.org> >>>> Acked-by: David Hildenbrand <david@redhat.com> >>>> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> >>>> Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com> >>>> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> >>>> Cc: Michal Miroslaw <mirq-linux@rere.qmqm.pl> >>>> Cc: Peter Zijlstra <peterz@infradead.org> >>>> Cc: Steven Rostedt <rostedt@goodmis.org> >>>> Cc: Matthew Wilcox <willy@infradead.org> >>>> Cc: David Hildenbrand <david@redhat.com> >>>> Cc: Al Viro <viro@zeniv.linux.org.uk> >>>> Cc: Kees Cook <keescook@chromium.org> >>>> Cc: Petr Mladek <pmladek@suse.com> >>>> --- >>>> include/linux/sched.h | 9 +++++++-- >>>> tools/testing/selftests/bpf/progs/test_stacktrace_map.c | 6 +++--- >>>> tools/testing/selftests/bpf/progs/test_tracepoint.c | 6 +++--- >>>> 3 files changed, 13 insertions(+), 8 deletions(-) >>> >>> Hey all, >>> I know this is a little late, but I recently got a report that >>> this change was causiing older versions of perfetto to stop >>> working. >>> >>> Apparently newer versions of perfetto has worked around this >>> via the following changes: >>> https://android.googlesource.com/platform/external/perfetto/+/c717c93131b1b6e3705a11092a70ac47c78b731d%5E%21/ >>> https://android.googlesource.com/platform/external/perfetto/+/160a504ad5c91a227e55f84d3e5d3fe22af7c2bb%5E%21/ >>> >>> But for older versions of perfetto, reverting upstream commit >>> 3087c61ed2c4 ("tools/testing/selftests/bpf: replace open-coded 16 >>> with TASK_COMM_LEN") is necessary to get it back to working. >>> >>> I haven't dug very far into the details, and obviously this doesn't >>> break with the updated perfetto, but from a high level this does >>> seem to be a breaking-userland regression. >>> >>> So I wanted to reach out to see if there was more context for this >>> breakage? I don't want to raise a unnecessary stink if this was >>> an unfortuante but forced situation. >> >> Let me understand what you're saying... >> >> The commit 3087c61ed2c4 did >> >> -/* Task command name length: */ >> -#define TASK_COMM_LEN 16 >> +/* >> + * Define the task command name length as enum, then it can be visible to >> + * BPF programs. >> + */ >> +enum { >> + TASK_COMM_LEN = 16, >> +}; >> >> >> and that caused: >> >> cat /sys/kernel/debug/tracing/events/task/task_newtask/format >> >> to print >> field:char comm[TASK_COMM_LEN]; offset:12; size:16; signed:0; >> instead of >> field:char comm[16]; offset:12; size:16; signed:0; >> >> so the ftrace parsing android tracing tool had to do: >> >> - if (Match(type_and_name.c_str(), R"(char [a-zA-Z_]+\[[0-9]+\])")) { >> + if (Match(type_and_name.c_str(), >> + R"(char [a-zA-Z_][a-zA-Z_0-9]*\[[a-zA-Z_0-9]+\])")) { >> >> to workaround this change. >> Right? > > I believe so. > >> And what are you proposing? > > I'm not proposing anything. I was just wanting to understand more > context around this, as it outwardly appears to be a user-breaking > change, and that is usually not done, so I figured it was an issue > worth raising. > > If the debug/tracing/*/format output is in the murky not-really-abi > space, that's fine, but I wanted to know if this was understood as > something that may require userland updates or if this was a > unexpected side-effect. If you are looking at the root cause in the kernel code generating this: kernel/trace/trace_events.c:f_show() /* * Smartly shows the array type(except dynamic array). * Normal: * field:TYPE VAR * If TYPE := TYPE[LEN], it is shown: * field:TYPE VAR[LEN] */ where it uses the content of field->type (a string) to format the VAR[LEN] part. This in turn is the result of the definition of the struct trace_event_fields done in: include/trace/trace_events.h at stage 4, thus with the context of those macros defined: include/trace/stages/stage4_event_fields.h: #undef __array #define __array(_type, _item, _len) { \ .type = #_type"["__stringify(_len)"]", .name = #_item, \ .size = sizeof(_type[_len]), .align = ALIGN_STRUCTFIELD(_type), \ .is_signed = is_signed_type(_type), .filter_type = FILTER_OTHER }, I suspect the real culprit here is the use of __stringify(_len), which happens to work on macros, but not on enum labels. One possible solution to make this more robust would be to extend struct trace_event_fields with one more field that indicates the length of an array as an actual integer, without storing it in its stringified form in the type, and do the formatting in f_show where it belongs. This way everybody can stay happy and no ABI is broken. Thoughts ? Thanks, Mathieu
On Wed, 8 Feb 2023 16:54:03 -0800 John Stultz <jstultz@google.com> wrote: > > Let me understand what you're saying... > > > > The commit 3087c61ed2c4 did > > > > -/* Task command name length: */ > > -#define TASK_COMM_LEN 16 > > +/* > > + * Define the task command name length as enum, then it can be visible to > > + * BPF programs. > > + */ > > +enum { > > + TASK_COMM_LEN = 16, > > +}; > > > > > > and that caused: > > > > cat /sys/kernel/debug/tracing/events/task/task_newtask/format > > > > to print > > field:char comm[TASK_COMM_LEN]; offset:12; size:16; signed:0; Yes because there's no easy way to automatically convert an enum to a number. And this has been a macro for a very long time (so it works, because macros convert to numbers). And this breaks much more than android. It will break trace-cmd, rasdaemon and perf (if it's not using BTF). This change very much "Breaks userspace!" And requires a kernel workaround, not a user space one. > > instead of > > field:char comm[16]; offset:12; size:16; signed:0; > > > > so the ftrace parsing android tracing tool had to do: > > > > - if (Match(type_and_name.c_str(), R"(char [a-zA-Z_]+\[[0-9]+\])")) { > > + if (Match(type_and_name.c_str(), > > + R"(char [a-zA-Z_][a-zA-Z_0-9]*\[[a-zA-Z_0-9]+\])")) { > > > > to workaround this change. > > Right? > > I believe so. > > > And what are you proposing? > > I'm not proposing anything. I was just wanting to understand more > context around this, as it outwardly appears to be a user-breaking > change, and that is usually not done, so I figured it was an issue > worth raising. > > If the debug/tracing/*/format output is in the murky not-really-abi > space, that's fine, but I wanted to know if this was understood as > something that may require userland updates or if this was a > unexpected side-effect. Linus has already said that /sys/kernel/tracing/* is an ABI (fyi, getting to the tracing directory via debugfs is obsolete). Usually, when a trace event uses an enum, it can do: TRACE_DEFINE_ENUM(TASK_COMM_LEN); But that's a very common define. It would require it updated for every trace event, or the change needs to be reverted. Not sure why BTF needs it like this, because it hasn't changed in years. Can't it just hard code it? For ftrace to change it, it requires reading the format files at boot up and replacing the enums with the numbers, which does impact start up. -- Steve
On Wed, 8 Feb 2023 21:28:58 -0500 Steven Rostedt <rostedt@goodmis.org> wrote: > And this breaks much more than android. It will break trace-cmd, rasdaemon > and perf (if it's not using BTF). This change very much "Breaks userspace!" > And requires a kernel workaround, not a user space one. OK, so it doesn't break perf, trace-cmd and rasdaemon, because the enum is only needed in the print_fmt part. It can handle it in the field portion. That is: system: sched name: sched_switch ID: 285 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1; signed:0; field:int common_pid; offset:4; size:4; signed:1; field:char prev_comm[TASK_COMM_LEN]; offset:8; size:16; signed:0; ^^^^^^^^^^^^^^ ^^ is ignored is used field:pid_t prev_pid; offset:24; size:4; signed:1; field:int prev_prio; offset:28; size:4; signed:1; field:long prev_state; offset:32; size:8; signed:1; field:char next_comm[TASK_COMM_LEN]; offset:40; size:16; signed:0; field:pid_t next_pid; offset:56; size:4; signed:1; field:int next_prio; offset:60; size:4; signed:1; print fmt: "prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s%s ==> next_comm=%s next_pid=%d next_prio=%d", REC->prev_comm, REC->prev_pid, REC->prev_prio, (REC->prev_state & ((((0x00000000 | 0x00000001 | 0x00000002 | 0x00000004 | 0x00000008 | 0x00000010 | 0x00000020 | 0x00000040) + 1) << 1) - 1)) ? __print_flags(REC->prev_state & ((((0x00000000 | 0x00000001 | 0x00000002 | 0x00000004 | 0x00000008 | 0x00000010 | 0x00000020 | 0x00000040) + 1) << 1) - 1), "|", { 0x00000001, "S" }, { 0x00000002, "D" }, { 0x00000004, "T" }, { 0x00000008, "t" }, { 0x00000010, "X" }, { 0x00000020, "Z" }, { 0x00000040, "P" }, { 0x00000080, "I" }) : "R", REC->prev_state & (((0x00000000 | 0x00000001 | 0x00000002 | 0x00000004 | 0x00000008 | 0x00000010 | 0x00000020 | 0x00000040) + 1) << 1) ? "+" : "", REC->next_comm, REC->next_pid, REC->next_prio ^^^^^^^ Is what requires the conversions. So I take that back. It only breaks perfetto, and that's because it writes its own parser and doesn't use libtraceevent. -- Steve
On Thu, Feb 9, 2023 at 10:07 AM Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote: > > On 2023-02-08 19:54, John Stultz wrote: > > On Wed, Feb 8, 2023 at 4:11 PM Alexei Starovoitov > > <alexei.starovoitov@gmail.com> wrote: > >> > >> On Wed, Feb 8, 2023 at 2:01 PM John Stultz <jstultz@google.com> wrote: > >>> > >>> On Sat, Nov 20, 2021 at 11:27:38AM +0000, Yafang Shao wrote: > >>>> As the sched:sched_switch tracepoint args are derived from the kernel, > >>>> we'd better make it same with the kernel. So the macro TASK_COMM_LEN is > >>>> converted to type enum, then all the BPF programs can get it through BTF. > >>>> > >>>> The BPF program which wants to use TASK_COMM_LEN should include the header > >>>> vmlinux.h. Regarding the test_stacktrace_map and test_tracepoint, as the > >>>> type defined in linux/bpf.h are also defined in vmlinux.h, so we don't > >>>> need to include linux/bpf.h again. > >>>> > >>>> Signed-off-by: Yafang Shao <laoar.shao@gmail.com> > >>>> Acked-by: Andrii Nakryiko <andrii@kernel.org> > >>>> Acked-by: David Hildenbrand <david@redhat.com> > >>>> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> > >>>> Cc: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com> > >>>> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> > >>>> Cc: Michal Miroslaw <mirq-linux@rere.qmqm.pl> > >>>> Cc: Peter Zijlstra <peterz@infradead.org> > >>>> Cc: Steven Rostedt <rostedt@goodmis.org> > >>>> Cc: Matthew Wilcox <willy@infradead.org> > >>>> Cc: David Hildenbrand <david@redhat.com> > >>>> Cc: Al Viro <viro@zeniv.linux.org.uk> > >>>> Cc: Kees Cook <keescook@chromium.org> > >>>> Cc: Petr Mladek <pmladek@suse.com> > >>>> --- > >>>> include/linux/sched.h | 9 +++++++-- > >>>> tools/testing/selftests/bpf/progs/test_stacktrace_map.c | 6 +++--- > >>>> tools/testing/selftests/bpf/progs/test_tracepoint.c | 6 +++--- > >>>> 3 files changed, 13 insertions(+), 8 deletions(-) > >>> > >>> Hey all, > >>> I know this is a little late, but I recently got a report that > >>> this change was causiing older versions of perfetto to stop > >>> working. > >>> > >>> Apparently newer versions of perfetto has worked around this > >>> via the following changes: > >>> https://android.googlesource.com/platform/external/perfetto/+/c717c93131b1b6e3705a11092a70ac47c78b731d%5E%21/ > >>> https://android.googlesource.com/platform/external/perfetto/+/160a504ad5c91a227e55f84d3e5d3fe22af7c2bb%5E%21/ > >>> > >>> But for older versions of perfetto, reverting upstream commit > >>> 3087c61ed2c4 ("tools/testing/selftests/bpf: replace open-coded 16 > >>> with TASK_COMM_LEN") is necessary to get it back to working. > >>> > >>> I haven't dug very far into the details, and obviously this doesn't > >>> break with the updated perfetto, but from a high level this does > >>> seem to be a breaking-userland regression. > >>> > >>> So I wanted to reach out to see if there was more context for this > >>> breakage? I don't want to raise a unnecessary stink if this was > >>> an unfortuante but forced situation. > >> > >> Let me understand what you're saying... > >> > >> The commit 3087c61ed2c4 did > >> > >> -/* Task command name length: */ > >> -#define TASK_COMM_LEN 16 > >> +/* > >> + * Define the task command name length as enum, then it can be visible to > >> + * BPF programs. > >> + */ > >> +enum { > >> + TASK_COMM_LEN = 16, > >> +}; > >> > >> > >> and that caused: > >> > >> cat /sys/kernel/debug/tracing/events/task/task_newtask/format > >> > >> to print > >> field:char comm[TASK_COMM_LEN]; offset:12; size:16; signed:0; > >> instead of > >> field:char comm[16]; offset:12; size:16; signed:0; > >> > >> so the ftrace parsing android tracing tool had to do: > >> > >> - if (Match(type_and_name.c_str(), R"(char [a-zA-Z_]+\[[0-9]+\])")) { > >> + if (Match(type_and_name.c_str(), > >> + R"(char [a-zA-Z_][a-zA-Z_0-9]*\[[a-zA-Z_0-9]+\])")) { > >> > >> to workaround this change. > >> Right? > > > > I believe so. > > > >> And what are you proposing? > > > > I'm not proposing anything. I was just wanting to understand more > > context around this, as it outwardly appears to be a user-breaking > > change, and that is usually not done, so I figured it was an issue > > worth raising. > > > > If the debug/tracing/*/format output is in the murky not-really-abi > > space, that's fine, but I wanted to know if this was understood as > > something that may require userland updates or if this was a > > unexpected side-effect. > > If you are looking at the root cause in the kernel code generating this: > > kernel/trace/trace_events.c:f_show() > > /* > * Smartly shows the array type(except dynamic array). > * Normal: > * field:TYPE VAR > * If TYPE := TYPE[LEN], it is shown: > * field:TYPE VAR[LEN] > */ > > where it uses the content of field->type (a string) to format the VAR[LEN] part. > > This in turn is the result of the definition of the > struct trace_event_fields done in: > > include/trace/trace_events.h at stage 4, thus with the context of those macros defined: > > include/trace/stages/stage4_event_fields.h: > > #undef __array > #define __array(_type, _item, _len) { \ > .type = #_type"["__stringify(_len)"]", .name = #_item, \ > .size = sizeof(_type[_len]), .align = ALIGN_STRUCTFIELD(_type), \ > .is_signed = is_signed_type(_type), .filter_type = FILTER_OTHER }, > > I suspect the real culprit here is the use of __stringify(_len), which happens to work > on macros, but not on enum labels. > > One possible solution to make this more robust would be to extend > struct trace_event_fields with one more field that indicates the length > of an array as an actual integer, without storing it in its stringified > form in the type, and do the formatting in f_show where it belongs. > > This way everybody can stay happy and no ABI is broken. > > Thoughts ? Many thanks for the detailed analysis. Seems it can work. Hi John, Could you pls. try the attached fix ? I have verified it in my test env. -- Regards Yafang
On Thu, Feb 09, 2023 at 02:20:36PM +0800, Yafang Shao wrote: [...] Hi Yafang, > Many thanks for the detailed analysis. Seems it can work. > > Hi John, > > Could you pls. try the attached fix ? I have verified it in my test env. I tested the patch on my environment where I found the issue with newer kernels + older Perfetto. The patch does improve things so that's nice. It goes from "not working at all" to "mostly working but missing data" compared to what happens if I just revert 3087c61ed2c48548b74dd343a5209b87082c682d. I'm just an end user so can't really speak to the underlying causes but for those more familiar with how Perfetto works this is what I'm getting: Error stats for this trace: name idx source value ---------------------------------------- ---------------------------------------- ---------------------------------------- ---------------------------------------- mismatched_sched_switch_tids [NULL] analysis 11101 systrace_parse_failure [NULL] analysis 19040 The trace explorer window ends up containing the ftrace-specific tracks but missing the tracks related to Android-specific callbacks and such. Debug stats below in case they're relevant: Name Value Type android_br_parse_errors 0 error (trace) android_log_format_invalid 0 error (trace) android_log_num_failed 0 error (trace) android_log_num_skipped 0 info (trace) android_log_num_total 0 info (trace) clock_sync_cache_miss 181 info (analysis) clock_sync_failure 0 error (analysis) compact_sched_has_parse_errors 0 error (trace) compact_sched_switch_skipped 0 info (analysis) compact_sched_waking_skipped 0 info (analysis) counter_events_out_of_order 0 error (analysis) deobfuscate_location_parse_error 0 error (trace) empty_chrome_metadata 0 error (trace) energy_breakdown_missing_values 0 error (analysis) energy_descriptor_invalid 0 error (analysis) energy_uid_breakdown_missing_values 0 error (analysis) flow_duplicate_id 0 error (trace) flow_end_without_start 0 info (trace) flow_invalid_id 0 error (trace) flow_no_enclosing_slice 0 error (trace) flow_step_without_start 0 info (trace) flow_without_direction 0 error (trace) frame_timeline_event_parser_errors 0 info (analysis) ftrace_bundle_tokenizer_errors 0 error (analysis) ftrace_cpu_bytes_read_begin[0] 0 info (trace) ftrace_cpu_bytes_read_begin[1] 264 info (trace) ftrace_cpu_bytes_read_begin[2] 0 info (trace) ftrace_cpu_bytes_read_begin[3] 224 info (trace) ftrace_cpu_bytes_read_begin[4] 0 info (trace) ftrace_cpu_bytes_read_begin[5] 0 info (trace) ftrace_cpu_bytes_read_begin[6] 0 info (trace) ftrace_cpu_bytes_read_begin[7] 0 info (trace) ftrace_cpu_bytes_read_delta[0] 6919836 info (trace) ftrace_cpu_bytes_read_delta[1] 7197556 info (trace) ftrace_cpu_bytes_read_delta[2] 6381828 info (trace) ftrace_cpu_bytes_read_delta[3] 5988336 info (trace) ftrace_cpu_bytes_read_delta[4] 5933528 info (trace) ftrace_cpu_bytes_read_delta[5] 4858400 info (trace) ftrace_cpu_bytes_read_delta[6] 6175260 info (trace) ftrace_cpu_bytes_read_delta[7] 4633460 info (trace) ftrace_cpu_bytes_read_end[0] 6919836 info (trace) ftrace_cpu_bytes_read_end[1] 7197820 info (trace) ftrace_cpu_bytes_read_end[2] 6381828 info (trace) ftrace_cpu_bytes_read_end[3] 5988560 info (trace) ftrace_cpu_bytes_read_end[4] 5933528 info (trace) ftrace_cpu_bytes_read_end[5] 4858400 info (trace) ftrace_cpu_bytes_read_end[6] 6175260 info (trace) ftrace_cpu_bytes_read_end[7] 4633460 info (trace) ftrace_cpu_commit_overrun_begin[0] 0 info (trace) ftrace_cpu_commit_overrun_begin[1] 0 info (trace) ftrace_cpu_commit_overrun_begin[2] 0 info (trace) ftrace_cpu_commit_overrun_begin[3] 0 info (trace) ftrace_cpu_commit_overrun_begin[4] 0 info (trace) ftrace_cpu_commit_overrun_begin[5] 0 info (trace) ftrace_cpu_commit_overrun_begin[6] 0 info (trace) ftrace_cpu_commit_overrun_begin[7] 0 info (trace) ftrace_cpu_commit_overrun_delta[0] 0 error (trace) ftrace_cpu_commit_overrun_delta[1] 0 error (trace) ftrace_cpu_commit_overrun_delta[2] 0 error (trace) ftrace_cpu_commit_overrun_delta[3] 0 error (trace) ftrace_cpu_commit_overrun_delta[4] 0 error (trace) ftrace_cpu_commit_overrun_delta[5] 0 error (trace) ftrace_cpu_commit_overrun_delta[6] 0 error (trace) ftrace_cpu_commit_overrun_delta[7] 0 error (trace) ftrace_cpu_commit_overrun_end[0] 0 info (trace) ftrace_cpu_commit_overrun_end[1] 0 info (trace) ftrace_cpu_commit_overrun_end[2] 0 info (trace) ftrace_cpu_commit_overrun_end[3] 0 info (trace) ftrace_cpu_commit_overrun_end[4] 0 info (trace) ftrace_cpu_commit_overrun_end[5] 0 info (trace) ftrace_cpu_commit_overrun_end[6] 0 info (trace) ftrace_cpu_commit_overrun_end[7] 0 info (trace) ftrace_cpu_dropped_events_begin[0] 0 info (trace) ftrace_cpu_dropped_events_begin[1] 0 info (trace) ftrace_cpu_dropped_events_begin[2] 0 info (trace) ftrace_cpu_dropped_events_begin[3] 0 info (trace) ftrace_cpu_dropped_events_begin[4] 0 info (trace) ftrace_cpu_dropped_events_begin[5] 0 info (trace) ftrace_cpu_dropped_events_begin[6] 0 info (trace) ftrace_cpu_dropped_events_begin[7] 0 info (trace) ftrace_cpu_dropped_events_delta[0] 0 error (trace) ftrace_cpu_dropped_events_delta[1] 0 error (trace) ftrace_cpu_dropped_events_delta[2] 0 error (trace) ftrace_cpu_dropped_events_delta[3] 0 error (trace) ftrace_cpu_dropped_events_delta[4] 0 error (trace) ftrace_cpu_dropped_events_delta[5] 0 error (trace) ftrace_cpu_dropped_events_delta[6] 0 error (trace) ftrace_cpu_dropped_events_delta[7] 0 error (trace) ftrace_cpu_dropped_events_end[0] 0 info (trace) ftrace_cpu_dropped_events_end[1] 0 info (trace) ftrace_cpu_dropped_events_end[2] 0 info (trace) ftrace_cpu_dropped_events_end[3] 0 info (trace) ftrace_cpu_dropped_events_end[4] 0 info (trace) ftrace_cpu_dropped_events_end[5] 0 info (trace) ftrace_cpu_dropped_events_end[6] 0 info (trace) ftrace_cpu_dropped_events_end[7] 0 info (trace) ftrace_cpu_entries_begin[0] 0 info (trace) ftrace_cpu_entries_begin[1] 6 info (trace) ftrace_cpu_entries_begin[2] 0 info (trace) ftrace_cpu_entries_begin[3] 5 info (trace) ftrace_cpu_entries_begin[4] 0 info (trace) ftrace_cpu_entries_begin[5] 0 info (trace) ftrace_cpu_entries_begin[6] 0 info (trace) ftrace_cpu_entries_begin[7] 0 info (trace) ftrace_cpu_entries_delta[0] 6 info (trace) ftrace_cpu_entries_delta[1] -6 info (trace) ftrace_cpu_entries_delta[2] 0 info (trace) ftrace_cpu_entries_delta[3] 2 info (trace) ftrace_cpu_entries_delta[4] 0 info (trace) ftrace_cpu_entries_delta[5] 0 info (trace) ftrace_cpu_entries_delta[6] 0 info (trace) ftrace_cpu_entries_delta[7] 0 info (trace) ftrace_cpu_entries_end[0] 6 info (trace) ftrace_cpu_entries_end[1] 0 info (trace) ftrace_cpu_entries_end[2] 0 info (trace) ftrace_cpu_entries_end[3] 7 info (trace) ftrace_cpu_entries_end[4] 0 info (trace) ftrace_cpu_entries_end[5] 0 info (trace) ftrace_cpu_entries_end[6] 0 info (trace) ftrace_cpu_entries_end[7] 0 info (trace) ftrace_cpu_now_ts_begin[0] 93305027000 info (trace) ftrace_cpu_now_ts_begin[1] 93305103000 info (trace) ftrace_cpu_now_ts_begin[2] 93305159000 info (trace) ftrace_cpu_now_ts_begin[3] 93305207000 info (trace) ftrace_cpu_now_ts_begin[4] 93305262000 info (trace) ftrace_cpu_now_ts_begin[5] 93305312000 info (trace) ftrace_cpu_now_ts_begin[6] 93305362000 info (trace) ftrace_cpu_now_ts_begin[7] 93305411000 info (trace) ftrace_cpu_now_ts_end[0] 282906571000 info (trace) ftrace_cpu_now_ts_end[1] 282906676000 info (trace) ftrace_cpu_now_ts_end[2] 282906738000 info (trace) ftrace_cpu_now_ts_end[3] 282906803000 info (trace) ftrace_cpu_now_ts_end[4] 282906863000 info (trace) ftrace_cpu_now_ts_end[5] 282906925000 info (trace) ftrace_cpu_now_ts_end[6] 282906987000 info (trace) ftrace_cpu_now_ts_end[7] 282907048000 info (trace) ftrace_cpu_oldest_event_ts_begin[0] 0 info (trace) ftrace_cpu_oldest_event_ts_begin[1] 93304642000 info (trace) ftrace_cpu_oldest_event_ts_begin[2] 0 info (trace) ftrace_cpu_oldest_event_ts_begin[3] 93304876000 info (trace) ftrace_cpu_oldest_event_ts_begin[4] 0 info (trace) ftrace_cpu_oldest_event_ts_begin[5] 0 info (trace) ftrace_cpu_oldest_event_ts_begin[6] 0 info (trace) ftrace_cpu_oldest_event_ts_begin[7] 0 info (trace) ftrace_cpu_oldest_event_ts_end[0] 282905715000 info (trace) ftrace_cpu_oldest_event_ts_end[1] 282903723000 info (trace) ftrace_cpu_oldest_event_ts_end[2] 282903881000 info (trace) ftrace_cpu_oldest_event_ts_end[3] 282816175000 info (trace) ftrace_cpu_oldest_event_ts_end[4] 282896619000 info (trace) ftrace_cpu_oldest_event_ts_end[5] 282884168000 info (trace) ftrace_cpu_oldest_event_ts_end[6] 282783221000 info (trace) ftrace_cpu_oldest_event_ts_end[7] 282880081000 info (trace) ftrace_cpu_overrun_begin[0] 0 info (trace) ftrace_cpu_overrun_begin[1] 0 info (trace) ftrace_cpu_overrun_begin[2] 0 info (trace) ftrace_cpu_overrun_begin[3] 0 info (trace) ftrace_cpu_overrun_begin[4] 0 info (trace) ftrace_cpu_overrun_begin[5] 0 info (trace) ftrace_cpu_overrun_begin[6] 0 info (trace) ftrace_cpu_overrun_begin[7] 0 info (trace) ftrace_cpu_overrun_delta[0]help_outline 0 data_loss (trace) ftrace_cpu_overrun_delta[1]help_outline 0 data_loss (trace) ftrace_cpu_overrun_delta[2]help_outline 0 data_loss (trace) ftrace_cpu_overrun_delta[3]help_outline 0 data_loss (trace) ftrace_cpu_overrun_delta[4]help_outline 0 data_loss (trace) ftrace_cpu_overrun_delta[5]help_outline 0 data_loss (trace) ftrace_cpu_overrun_delta[6]help_outline 0 data_loss (trace) ftrace_cpu_overrun_delta[7]help_outline 0 data_loss (trace) ftrace_cpu_overrun_end[0] 0 info (trace) ftrace_cpu_overrun_end[1] 0 info (trace) ftrace_cpu_overrun_end[2] 0 info (trace) ftrace_cpu_overrun_end[3] 0 info (trace) ftrace_cpu_overrun_end[4] 0 info (trace) ftrace_cpu_overrun_end[5] 0 info (trace) ftrace_cpu_overrun_end[6] 0 info (trace) ftrace_cpu_overrun_end[7] 0 info (trace) ftrace_cpu_read_events_begin[0] 0 info (trace) ftrace_cpu_read_events_begin[1] 0 info (trace) ftrace_cpu_read_events_begin[2] 0 info (trace) ftrace_cpu_read_events_begin[3] 0 info (trace) ftrace_cpu_read_events_begin[4] 0 info (trace) ftrace_cpu_read_events_begin[5] 0 info (trace) ftrace_cpu_read_events_begin[6] 0 info (trace) ftrace_cpu_read_events_begin[7] 0 info (trace) ftrace_cpu_read_events_delta[0] 454848 info (trace) ftrace_cpu_read_events_delta[1] 453484 info (trace) ftrace_cpu_read_events_delta[2] 386290 info (trace) ftrace_cpu_read_events_delta[3] 356432 info (trace) ftrace_cpu_read_events_delta[4] 393337 info (trace) ftrace_cpu_read_events_delta[5] 325244 info (trace) ftrace_cpu_read_events_delta[6] 392637 info (trace) ftrace_cpu_read_events_delta[7] 350623 info (trace) ftrace_cpu_read_events_end[0] 454848 info (trace) ftrace_cpu_read_events_end[1] 453484 info (trace) ftrace_cpu_read_events_end[2] 386290 info (trace) ftrace_cpu_read_events_end[3] 356432 info (trace) ftrace_cpu_read_events_end[4] 393337 info (trace) ftrace_cpu_read_events_end[5] 325244 info (trace) ftrace_cpu_read_events_end[6] 392637 info (trace) ftrace_cpu_read_events_end[7] 350623 info (trace) ftrace_packet_before_tracing_starthelp_outline 0 info (analysis) ftrace_setup_errorshelp_outline 0 error (trace) fuchsia_invalid_event 0 error (analysis) fuchsia_non_numeric_counters 0 error (analysis) fuchsia_timestamp_overflow 0 error (analysis) game_intervention_has_parse_errorshelp_outline 0 error (trace) game_intervention_has_read_errorshelp_outline 0 error (trace) gpu_counters_invalid_spec 0 error (analysis) gpu_counters_missing_spec 0 error (analysis) gpu_render_stage_parser_errors 0 error (analysis) graphics_frame_event_parser_errors 0 info (analysis) guess_trace_type_duration_ns 7654 info (analysis) heap_graph_non_finalized_graph 0 error (trace) heapprofd_missing_packet 0 error (trace) heapprofd_non_finalized_profile 0 error (trace) interned_data_tokenizer_errors 0 info (analysis) invalid_clock_snapshots 0 error (analysis) invalid_cpu_times 0 error (analysis) json_display_time_unithelp_outline 0 info (trace) json_parser_failure 0 error (trace) json_tokenizer_failure 0 error (trace) meminfo_unknown_keys 0 error (analysis) memory_snapshot_parser_failure 0 error (analysis) metatrace_overruns 0 error (trace) mismatched_sched_switch_tids 11101 error (analysis) misplaced_end_event 0 data_loss (analysis) mm_unknown_type 0 error (analysis) ninja_parse_errors 0 error (trace) packages_list_has_parse_errors 0 error (trace) packages_list_has_read_errors 0 error (trace) parse_trace_duration_ns 1780589548 info (analysis) perf_samples_skipped 0 info (trace) perf_samples_skipped_dataloss 0 data_loss (trace) power_rail_unknown_index 0 error (trace) proc_stat_unknown_counters 0 error (analysis) process_tracker_errors 0 error (analysis) rss_stat_negative_size 0 info (analysis) rss_stat_unknown_keys 0 error (analysis) rss_stat_unknown_thread_for_mm_id 0 info (analysis) sched_switch_out_of_order 0 error (analysis) sched_waking_out_of_order 0 error (analysis) slice_out_of_order 0 error (analysis) sorter_push_event_out_of_orderhelp_outline 0 error (trace) stackprofile_invalid_callstack_id 0 error (trace) stackprofile_invalid_frame_id 0 error (trace) stackprofile_invalid_mapping_id 0 error (trace) stackprofile_invalid_string_id 0 error (trace) stackprofile_parser_error 0 error (trace) symbolization_tmp_build_id_not_foundhelp_outline 0 error (analysis) systrace_parse_failure 19040 error (analysis) task_state_invalid 0 error (analysis) thread_time_in_state_out_of_order 0 error (analysis) thread_time_in_state_unknown_cpu_freq 0 error (analysis) tokenizer_skipped_packets 0 info (analysis) traced_buf_abi_violations[0] 0 data_loss (trace) traced_buf_abi_violations[1] 0 data_loss (trace) traced_buf_buffer_size[0] 534773760 info (trace) traced_buf_buffer_size[1] 2097152 info (trace) traced_buf_bytes_overwritten[0] 0 info (trace) traced_buf_bytes_overwritten[1] 0 info (trace) traced_buf_bytes_read[0] 78929920 info (trace) traced_buf_bytes_read[1] 425984 info (trace) traced_buf_bytes_written[0] 78962688 info (trace) traced_buf_bytes_written[1] 425984 info (trace) traced_buf_chunks_committed_out_of_order[0] 0 info (trace) traced_buf_chunks_committed_out_of_order[1] 0 info (trace) traced_buf_chunks_discarded[0] 0 info (trace) traced_buf_chunks_discarded[1] 0 info (trace) traced_buf_chunks_overwritten[0] 0 info (trace) traced_buf_chunks_overwritten[1] 0 info (trace) traced_buf_chunks_read[0] 2428 info (trace) traced_buf_chunks_read[1] 13 info (trace) traced_buf_chunks_rewritten[0] 6 info (trace) traced_buf_chunks_rewritten[1] 0 info (trace) traced_buf_chunks_written[0] 2429 info (trace) traced_buf_chunks_written[1] 13 info (trace) traced_buf_padding_bytes_cleared[0] 0 info (trace) traced_buf_padding_bytes_cleared[1] 0 info (trace) traced_buf_padding_bytes_written[0] 0 info (trace) traced_buf_padding_bytes_written[1] 0 info (trace) traced_buf_patches_failed[0] 0 data_loss (trace) traced_buf_patches_failed[1] 0 data_loss (trace) traced_buf_patches_succeeded[0] 5633 info (trace) traced_buf_patches_succeeded[1] 8 info (trace) traced_buf_readaheads_failed[0] 115 info (trace) traced_buf_readaheads_failed[1] 18 info (trace) traced_buf_readaheads_succeeded[0] 2257 info (trace) traced_buf_readaheads_succeeded[1] 6 info (trace) traced_buf_trace_writer_packet_loss[0] 0 data_loss (trace) traced_buf_trace_writer_packet_loss[1] 0 data_loss (trace) traced_buf_write_wrap_count[0] 0 info (trace) traced_buf_write_wrap_count[1] 0 info (trace) traced_chunks_discarded 0 info (trace) traced_data_sources_registered 16 info (trace) traced_data_sources_seen 6 info (trace) traced_final_flush_failed 0 data_loss (trace) traced_final_flush_succeeded 0 info (trace) traced_flushes_failed 0 data_loss (trace) traced_flushes_requested 0 info (trace) traced_flushes_succeeded 0 info (trace) traced_patches_discarded 0 info (trace) traced_producers_connected 3 info (trace) traced_producers_seen 3 info (trace) traced_total_buffers 2 info (trace) traced_tracing_sessions 1 info (trace) track_event_dropped_packets_outside_of_range_of_interesthelp_outline 0 info (analysis) track_event_parser_errors 0 info (analysis) track_event_thread_invalid_endhelp_outline 0 error (trace) track_event_tokenizer_errors 0 info (analysis) truncated_sys_write_durationhelp_outline 0 data_loss (analysis) unknown_extension_fieldshelp_outline 0 error (trace) vmstat_unknown_keys 0 error (analysis) vulkan_allocations_invalid_string_id 0 error (trace) > -- > Regards > Yafang
On Thu, Feb 9, 2023 at 10:28 PM Kajetan Puchalski <kajetan.puchalski@arm.com> wrote: > > On Thu, Feb 09, 2023 at 02:20:36PM +0800, Yafang Shao wrote: > > [...] > > Hi Yafang, > > > Many thanks for the detailed analysis. Seems it can work. > > > > Hi John, > > > > Could you pls. try the attached fix ? I have verified it in my test env. > > I tested the patch on my environment where I found the issue with newer > kernels + older Perfetto. The patch does improve things so that's nice. Thanks for the test. I don't have Perfetto in hand, so I haven't verify Perfetto. > It goes from "not working at all" to "mostly working but missing data" > compared to what happens if I just revert 3087c61ed2c48548b74dd343a5209b87082c682d. > Do you mean there are no errors at all if revert 3087c61ed2c48548b74dd343a5209b87082c682d ? > I'm just an end user so can't really speak to the underlying causes but > for those more familiar with how Perfetto works this is what I'm getting: > The sched_switch tracepoint format file has the same output with reverting the commit, $ cat /sys/kernel/debug/tracing/events/sched/sched_switch/format name: sched_switch ID: 286 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1; signed:0; field:int common_pid; offset:4; size:4; signed:1; field:char prev_comm[16]; offset:8; size:16; signed:0; field:pid_t prev_pid; offset:24; size:4; signed:1; field:int prev_prio; offset:28; size:4; signed:1; field:long prev_state; offset:32; size:8; signed:1; field:char next_comm[16]; offset:40; size:16; signed:0; field:pid_t next_pid; offset:56; size:4; signed:1; field:int next_prio; offset:60; size:4; signed:1; print fmt: "prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s%s ==> next_comm=%s next_pid=%d next_prio=%d", REC->prev_comm, REC->prev_pid, REC->prev_prio, (REC->prev_state & ((((0x00000000 | 0x00000001 | 0x00000002 | 0x00000004 | 0x00000008 | 0x00000010 | 0x00000020 | 0x00000040) + 1) << 1) - 1)) ? __print_flags(REC->prev_state & ((((0x00000000 | 0x00000001 | 0x00000002 | 0x00000004 | 0x00000008 | 0x00000010 | 0x00000020 | 0x00000040) + 1) << 1) - 1), "|", { 0x00000001, "S" }, { 0x00000002, "D" }, { 0x00000004, "T" }, { 0x00000008, "t" }, { 0x00000010, "X" }, { 0x00000020, "Z" }, { 0x00000040, "P" }, { 0x00000080, "I" }) : "R", REC->prev_state & (((0x00000000 | 0x00000001 | 0x00000002 | 0x00000004 | 0x00000008 | 0x00000010 | 0x00000020 | 0x00000040) + 1) << 1) ? "+" : "", REC->next_comm, REC->next_pid, REC->next_prio So may be these errors were caused by other issues ? > Error stats for this trace: > name idx source value > ---------------------------------------- ---------------------------------------- ---------------------------------------- ---------------------------------------- > mismatched_sched_switch_tids [NULL] analysis 11101 > systrace_parse_failure [NULL] analysis 19040 > > The trace explorer window ends up containing the ftrace-specific tracks > but missing the tracks related to Android-specific callbacks and such. > > Debug stats below in case they're relevant: > > Name Value Type > android_br_parse_errors 0 error (trace) > android_log_format_invalid 0 error (trace) > android_log_num_failed 0 error (trace) > android_log_num_skipped 0 info (trace) > android_log_num_total 0 info (trace) > clock_sync_cache_miss 181 info (analysis) > clock_sync_failure 0 error (analysis) > compact_sched_has_parse_errors 0 error (trace) > compact_sched_switch_skipped 0 info (analysis) > compact_sched_waking_skipped 0 info (analysis) > counter_events_out_of_order 0 error (analysis) > deobfuscate_location_parse_error 0 error (trace) > empty_chrome_metadata 0 error (trace) > energy_breakdown_missing_values 0 error (analysis) > energy_descriptor_invalid 0 error (analysis) > energy_uid_breakdown_missing_values 0 error (analysis) > flow_duplicate_id 0 error (trace) > flow_end_without_start 0 info (trace) > flow_invalid_id 0 error (trace) > flow_no_enclosing_slice 0 error (trace) > flow_step_without_start 0 info (trace) > flow_without_direction 0 error (trace) > frame_timeline_event_parser_errors 0 info (analysis) > ftrace_bundle_tokenizer_errors 0 error (analysis) > ftrace_cpu_bytes_read_begin[0] 0 info (trace) > ftrace_cpu_bytes_read_begin[1] 264 info (trace) > ftrace_cpu_bytes_read_begin[2] 0 info (trace) > ftrace_cpu_bytes_read_begin[3] 224 info (trace) > ftrace_cpu_bytes_read_begin[4] 0 info (trace) > ftrace_cpu_bytes_read_begin[5] 0 info (trace) > ftrace_cpu_bytes_read_begin[6] 0 info (trace) > ftrace_cpu_bytes_read_begin[7] 0 info (trace) > ftrace_cpu_bytes_read_delta[0] 6919836 info (trace) > ftrace_cpu_bytes_read_delta[1] 7197556 info (trace) > ftrace_cpu_bytes_read_delta[2] 6381828 info (trace) > ftrace_cpu_bytes_read_delta[3] 5988336 info (trace) > ftrace_cpu_bytes_read_delta[4] 5933528 info (trace) > ftrace_cpu_bytes_read_delta[5] 4858400 info (trace) > ftrace_cpu_bytes_read_delta[6] 6175260 info (trace) > ftrace_cpu_bytes_read_delta[7] 4633460 info (trace) > ftrace_cpu_bytes_read_end[0] 6919836 info (trace) > ftrace_cpu_bytes_read_end[1] 7197820 info (trace) > ftrace_cpu_bytes_read_end[2] 6381828 info (trace) > ftrace_cpu_bytes_read_end[3] 5988560 info (trace) > ftrace_cpu_bytes_read_end[4] 5933528 info (trace) > ftrace_cpu_bytes_read_end[5] 4858400 info (trace) > ftrace_cpu_bytes_read_end[6] 6175260 info (trace) > ftrace_cpu_bytes_read_end[7] 4633460 info (trace) > ftrace_cpu_commit_overrun_begin[0] 0 info (trace) > ftrace_cpu_commit_overrun_begin[1] 0 info (trace) > ftrace_cpu_commit_overrun_begin[2] 0 info (trace) > ftrace_cpu_commit_overrun_begin[3] 0 info (trace) > ftrace_cpu_commit_overrun_begin[4] 0 info (trace) > ftrace_cpu_commit_overrun_begin[5] 0 info (trace) > ftrace_cpu_commit_overrun_begin[6] 0 info (trace) > ftrace_cpu_commit_overrun_begin[7] 0 info (trace) > ftrace_cpu_commit_overrun_delta[0] 0 error (trace) > ftrace_cpu_commit_overrun_delta[1] 0 error (trace) > ftrace_cpu_commit_overrun_delta[2] 0 error (trace) > ftrace_cpu_commit_overrun_delta[3] 0 error (trace) > ftrace_cpu_commit_overrun_delta[4] 0 error (trace) > ftrace_cpu_commit_overrun_delta[5] 0 error (trace) > ftrace_cpu_commit_overrun_delta[6] 0 error (trace) > ftrace_cpu_commit_overrun_delta[7] 0 error (trace) > ftrace_cpu_commit_overrun_end[0] 0 info (trace) > ftrace_cpu_commit_overrun_end[1] 0 info (trace) > ftrace_cpu_commit_overrun_end[2] 0 info (trace) > ftrace_cpu_commit_overrun_end[3] 0 info (trace) > ftrace_cpu_commit_overrun_end[4] 0 info (trace) > ftrace_cpu_commit_overrun_end[5] 0 info (trace) > ftrace_cpu_commit_overrun_end[6] 0 info (trace) > ftrace_cpu_commit_overrun_end[7] 0 info (trace) > ftrace_cpu_dropped_events_begin[0] 0 info (trace) > ftrace_cpu_dropped_events_begin[1] 0 info (trace) > ftrace_cpu_dropped_events_begin[2] 0 info (trace) > ftrace_cpu_dropped_events_begin[3] 0 info (trace) > ftrace_cpu_dropped_events_begin[4] 0 info (trace) > ftrace_cpu_dropped_events_begin[5] 0 info (trace) > ftrace_cpu_dropped_events_begin[6] 0 info (trace) > ftrace_cpu_dropped_events_begin[7] 0 info (trace) > ftrace_cpu_dropped_events_delta[0] 0 error (trace) > ftrace_cpu_dropped_events_delta[1] 0 error (trace) > ftrace_cpu_dropped_events_delta[2] 0 error (trace) > ftrace_cpu_dropped_events_delta[3] 0 error (trace) > ftrace_cpu_dropped_events_delta[4] 0 error (trace) > ftrace_cpu_dropped_events_delta[5] 0 error (trace) > ftrace_cpu_dropped_events_delta[6] 0 error (trace) > ftrace_cpu_dropped_events_delta[7] 0 error (trace) > ftrace_cpu_dropped_events_end[0] 0 info (trace) > ftrace_cpu_dropped_events_end[1] 0 info (trace) > ftrace_cpu_dropped_events_end[2] 0 info (trace) > ftrace_cpu_dropped_events_end[3] 0 info (trace) > ftrace_cpu_dropped_events_end[4] 0 info (trace) > ftrace_cpu_dropped_events_end[5] 0 info (trace) > ftrace_cpu_dropped_events_end[6] 0 info (trace) > ftrace_cpu_dropped_events_end[7] 0 info (trace) > ftrace_cpu_entries_begin[0] 0 info (trace) > ftrace_cpu_entries_begin[1] 6 info (trace) > ftrace_cpu_entries_begin[2] 0 info (trace) > ftrace_cpu_entries_begin[3] 5 info (trace) > ftrace_cpu_entries_begin[4] 0 info (trace) > ftrace_cpu_entries_begin[5] 0 info (trace) > ftrace_cpu_entries_begin[6] 0 info (trace) > ftrace_cpu_entries_begin[7] 0 info (trace) > ftrace_cpu_entries_delta[0] 6 info (trace) > ftrace_cpu_entries_delta[1] -6 info (trace) > ftrace_cpu_entries_delta[2] 0 info (trace) > ftrace_cpu_entries_delta[3] 2 info (trace) > ftrace_cpu_entries_delta[4] 0 info (trace) > ftrace_cpu_entries_delta[5] 0 info (trace) > ftrace_cpu_entries_delta[6] 0 info (trace) > ftrace_cpu_entries_delta[7] 0 info (trace) > ftrace_cpu_entries_end[0] 6 info (trace) > ftrace_cpu_entries_end[1] 0 info (trace) > ftrace_cpu_entries_end[2] 0 info (trace) > ftrace_cpu_entries_end[3] 7 info (trace) > ftrace_cpu_entries_end[4] 0 info (trace) > ftrace_cpu_entries_end[5] 0 info (trace) > ftrace_cpu_entries_end[6] 0 info (trace) > ftrace_cpu_entries_end[7] 0 info (trace) > ftrace_cpu_now_ts_begin[0] 93305027000 info (trace) > ftrace_cpu_now_ts_begin[1] 93305103000 info (trace) > ftrace_cpu_now_ts_begin[2] 93305159000 info (trace) > ftrace_cpu_now_ts_begin[3] 93305207000 info (trace) > ftrace_cpu_now_ts_begin[4] 93305262000 info (trace) > ftrace_cpu_now_ts_begin[5] 93305312000 info (trace) > ftrace_cpu_now_ts_begin[6] 93305362000 info (trace) > ftrace_cpu_now_ts_begin[7] 93305411000 info (trace) > ftrace_cpu_now_ts_end[0] 282906571000 info (trace) > ftrace_cpu_now_ts_end[1] 282906676000 info (trace) > ftrace_cpu_now_ts_end[2] 282906738000 info (trace) > ftrace_cpu_now_ts_end[3] 282906803000 info (trace) > ftrace_cpu_now_ts_end[4] 282906863000 info (trace) > ftrace_cpu_now_ts_end[5] 282906925000 info (trace) > ftrace_cpu_now_ts_end[6] 282906987000 info (trace) > ftrace_cpu_now_ts_end[7] 282907048000 info (trace) > ftrace_cpu_oldest_event_ts_begin[0] 0 info (trace) > ftrace_cpu_oldest_event_ts_begin[1] 93304642000 info (trace) > ftrace_cpu_oldest_event_ts_begin[2] 0 info (trace) > ftrace_cpu_oldest_event_ts_begin[3] 93304876000 info (trace) > ftrace_cpu_oldest_event_ts_begin[4] 0 info (trace) > ftrace_cpu_oldest_event_ts_begin[5] 0 info (trace) > ftrace_cpu_oldest_event_ts_begin[6] 0 info (trace) > ftrace_cpu_oldest_event_ts_begin[7] 0 info (trace) > ftrace_cpu_oldest_event_ts_end[0] 282905715000 info (trace) > ftrace_cpu_oldest_event_ts_end[1] 282903723000 info (trace) > ftrace_cpu_oldest_event_ts_end[2] 282903881000 info (trace) > ftrace_cpu_oldest_event_ts_end[3] 282816175000 info (trace) > ftrace_cpu_oldest_event_ts_end[4] 282896619000 info (trace) > ftrace_cpu_oldest_event_ts_end[5] 282884168000 info (trace) > ftrace_cpu_oldest_event_ts_end[6] 282783221000 info (trace) > ftrace_cpu_oldest_event_ts_end[7] 282880081000 info (trace) > ftrace_cpu_overrun_begin[0] 0 info (trace) > ftrace_cpu_overrun_begin[1] 0 info (trace) > ftrace_cpu_overrun_begin[2] 0 info (trace) > ftrace_cpu_overrun_begin[3] 0 info (trace) > ftrace_cpu_overrun_begin[4] 0 info (trace) > ftrace_cpu_overrun_begin[5] 0 info (trace) > ftrace_cpu_overrun_begin[6] 0 info (trace) > ftrace_cpu_overrun_begin[7] 0 info (trace) > ftrace_cpu_overrun_delta[0]help_outline 0 data_loss (trace) > ftrace_cpu_overrun_delta[1]help_outline 0 data_loss (trace) > ftrace_cpu_overrun_delta[2]help_outline 0 data_loss (trace) > ftrace_cpu_overrun_delta[3]help_outline 0 data_loss (trace) > ftrace_cpu_overrun_delta[4]help_outline 0 data_loss (trace) > ftrace_cpu_overrun_delta[5]help_outline 0 data_loss (trace) > ftrace_cpu_overrun_delta[6]help_outline 0 data_loss (trace) > ftrace_cpu_overrun_delta[7]help_outline 0 data_loss (trace) > ftrace_cpu_overrun_end[0] 0 info (trace) > ftrace_cpu_overrun_end[1] 0 info (trace) > ftrace_cpu_overrun_end[2] 0 info (trace) > ftrace_cpu_overrun_end[3] 0 info (trace) > ftrace_cpu_overrun_end[4] 0 info (trace) > ftrace_cpu_overrun_end[5] 0 info (trace) > ftrace_cpu_overrun_end[6] 0 info (trace) > ftrace_cpu_overrun_end[7] 0 info (trace) > ftrace_cpu_read_events_begin[0] 0 info (trace) > ftrace_cpu_read_events_begin[1] 0 info (trace) > ftrace_cpu_read_events_begin[2] 0 info (trace) > ftrace_cpu_read_events_begin[3] 0 info (trace) > ftrace_cpu_read_events_begin[4] 0 info (trace) > ftrace_cpu_read_events_begin[5] 0 info (trace) > ftrace_cpu_read_events_begin[6] 0 info (trace) > ftrace_cpu_read_events_begin[7] 0 info (trace) > ftrace_cpu_read_events_delta[0] 454848 info (trace) > ftrace_cpu_read_events_delta[1] 453484 info (trace) > ftrace_cpu_read_events_delta[2] 386290 info (trace) > ftrace_cpu_read_events_delta[3] 356432 info (trace) > ftrace_cpu_read_events_delta[4] 393337 info (trace) > ftrace_cpu_read_events_delta[5] 325244 info (trace) > ftrace_cpu_read_events_delta[6] 392637 info (trace) > ftrace_cpu_read_events_delta[7] 350623 info (trace) > ftrace_cpu_read_events_end[0] 454848 info (trace) > ftrace_cpu_read_events_end[1] 453484 info (trace) > ftrace_cpu_read_events_end[2] 386290 info (trace) > ftrace_cpu_read_events_end[3] 356432 info (trace) > ftrace_cpu_read_events_end[4] 393337 info (trace) > ftrace_cpu_read_events_end[5] 325244 info (trace) > ftrace_cpu_read_events_end[6] 392637 info (trace) > ftrace_cpu_read_events_end[7] 350623 info (trace) > ftrace_packet_before_tracing_starthelp_outline 0 info (analysis) > ftrace_setup_errorshelp_outline 0 error (trace) > fuchsia_invalid_event 0 error (analysis) > fuchsia_non_numeric_counters 0 error (analysis) > fuchsia_timestamp_overflow 0 error (analysis) > game_intervention_has_parse_errorshelp_outline 0 error (trace) > game_intervention_has_read_errorshelp_outline 0 error (trace) > gpu_counters_invalid_spec 0 error (analysis) > gpu_counters_missing_spec 0 error (analysis) > gpu_render_stage_parser_errors 0 error (analysis) > graphics_frame_event_parser_errors 0 info (analysis) > guess_trace_type_duration_ns 7654 info (analysis) > heap_graph_non_finalized_graph 0 error (trace) > heapprofd_missing_packet 0 error (trace) > heapprofd_non_finalized_profile 0 error (trace) > interned_data_tokenizer_errors 0 info (analysis) > invalid_clock_snapshots 0 error (analysis) > invalid_cpu_times 0 error (analysis) > json_display_time_unithelp_outline 0 info (trace) > json_parser_failure 0 error (trace) > json_tokenizer_failure 0 error (trace) > meminfo_unknown_keys 0 error (analysis) > memory_snapshot_parser_failure 0 error (analysis) > metatrace_overruns 0 error (trace) > mismatched_sched_switch_tids 11101 error (analysis) > misplaced_end_event 0 data_loss (analysis) > mm_unknown_type 0 error (analysis) > ninja_parse_errors 0 error (trace) > packages_list_has_parse_errors 0 error (trace) > packages_list_has_read_errors 0 error (trace) > parse_trace_duration_ns 1780589548 info (analysis) > perf_samples_skipped 0 info (trace) > perf_samples_skipped_dataloss 0 data_loss (trace) > power_rail_unknown_index 0 error (trace) > proc_stat_unknown_counters 0 error (analysis) > process_tracker_errors 0 error (analysis) > rss_stat_negative_size 0 info (analysis) > rss_stat_unknown_keys 0 error (analysis) > rss_stat_unknown_thread_for_mm_id 0 info (analysis) > sched_switch_out_of_order 0 error (analysis) > sched_waking_out_of_order 0 error (analysis) > slice_out_of_order 0 error (analysis) > sorter_push_event_out_of_orderhelp_outline 0 error (trace) > stackprofile_invalid_callstack_id 0 error (trace) > stackprofile_invalid_frame_id 0 error (trace) > stackprofile_invalid_mapping_id 0 error (trace) > stackprofile_invalid_string_id 0 error (trace) > stackprofile_parser_error 0 error (trace) > symbolization_tmp_build_id_not_foundhelp_outline 0 error (analysis) > systrace_parse_failure 19040 error (analysis) > task_state_invalid 0 error (analysis) > thread_time_in_state_out_of_order 0 error (analysis) > thread_time_in_state_unknown_cpu_freq 0 error (analysis) > tokenizer_skipped_packets 0 info (analysis) > traced_buf_abi_violations[0] 0 data_loss (trace) > traced_buf_abi_violations[1] 0 data_loss (trace) > traced_buf_buffer_size[0] 534773760 info (trace) > traced_buf_buffer_size[1] 2097152 info (trace) > traced_buf_bytes_overwritten[0] 0 info (trace) > traced_buf_bytes_overwritten[1] 0 info (trace) > traced_buf_bytes_read[0] 78929920 info (trace) > traced_buf_bytes_read[1] 425984 info (trace) > traced_buf_bytes_written[0] 78962688 info (trace) > traced_buf_bytes_written[1] 425984 info (trace) > traced_buf_chunks_committed_out_of_order[0] 0 info (trace) > traced_buf_chunks_committed_out_of_order[1] 0 info (trace) > traced_buf_chunks_discarded[0] 0 info (trace) > traced_buf_chunks_discarded[1] 0 info (trace) > traced_buf_chunks_overwritten[0] 0 info (trace) > traced_buf_chunks_overwritten[1] 0 info (trace) > traced_buf_chunks_read[0] 2428 info (trace) > traced_buf_chunks_read[1] 13 info (trace) > traced_buf_chunks_rewritten[0] 6 info (trace) > traced_buf_chunks_rewritten[1] 0 info (trace) > traced_buf_chunks_written[0] 2429 info (trace) > traced_buf_chunks_written[1] 13 info (trace) > traced_buf_padding_bytes_cleared[0] 0 info (trace) > traced_buf_padding_bytes_cleared[1] 0 info (trace) > traced_buf_padding_bytes_written[0] 0 info (trace) > traced_buf_padding_bytes_written[1] 0 info (trace) > traced_buf_patches_failed[0] 0 data_loss (trace) > traced_buf_patches_failed[1] 0 data_loss (trace) > traced_buf_patches_succeeded[0] 5633 info (trace) > traced_buf_patches_succeeded[1] 8 info (trace) > traced_buf_readaheads_failed[0] 115 info (trace) > traced_buf_readaheads_failed[1] 18 info (trace) > traced_buf_readaheads_succeeded[0] 2257 info (trace) > traced_buf_readaheads_succeeded[1] 6 info (trace) > traced_buf_trace_writer_packet_loss[0] 0 data_loss (trace) > traced_buf_trace_writer_packet_loss[1] 0 data_loss (trace) > traced_buf_write_wrap_count[0] 0 info (trace) > traced_buf_write_wrap_count[1] 0 info (trace) > traced_chunks_discarded 0 info (trace) > traced_data_sources_registered 16 info (trace) > traced_data_sources_seen 6 info (trace) > traced_final_flush_failed 0 data_loss (trace) > traced_final_flush_succeeded 0 info (trace) > traced_flushes_failed 0 data_loss (trace) > traced_flushes_requested 0 info (trace) > traced_flushes_succeeded 0 info (trace) > traced_patches_discarded 0 info (trace) > traced_producers_connected 3 info (trace) > traced_producers_seen 3 info (trace) > traced_total_buffers 2 info (trace) > traced_tracing_sessions 1 info (trace) > track_event_dropped_packets_outside_of_range_of_interesthelp_outline 0 info (analysis) > track_event_parser_errors 0 info (analysis) > track_event_thread_invalid_endhelp_outline 0 error (trace) > track_event_tokenizer_errors 0 info (analysis) > truncated_sys_write_durationhelp_outline 0 data_loss (analysis) > unknown_extension_fieldshelp_outline 0 error (trace) > vmstat_unknown_keys 0 error (analysis) > vulkan_allocations_invalid_string_id 0 error (trace) > > > -- > > Regards > > Yafang > >
On Thu, Feb 09, 2023 at 11:37:44PM +0800, Yafang Shao wrote: > > > It goes from "not working at all" to "mostly working but missing data" > > compared to what happens if I just revert 3087c61ed2c48548b74dd343a5209b87082c682d. > > > > Do you mean there are no errors at all if revert > 3087c61ed2c48548b74dd343a5209b87082c682d ? Correct yes, the revert makes it work perfectly. > > I'm just an end user so can't really speak to the underlying causes but > > for those more familiar with how Perfetto works this is what I'm getting: > > > > The sched_switch tracepoint format file has the same output with > reverting the commit, > > $ cat /sys/kernel/debug/tracing/events/sched/sched_switch/format > name: sched_switch > ID: 286 > format: > field:unsigned short common_type; offset:0; size:2; signed:0; > field:unsigned char common_flags; offset:2; size:1; signed:0; > field:unsigned char common_preempt_count; offset:3; size:1; signed:0; > field:int common_pid; offset:4; size:4; signed:1; > field:char prev_comm[16]; offset:8; size:16; signed:0; > field:pid_t prev_pid; offset:24; size:4; signed:1; > field:int prev_prio; offset:28; size:4; signed:1; > field:long prev_state; offset:32; size:8; signed:1; > field:char next_comm[16]; offset:40; size:16; signed:0; > field:pid_t next_pid; offset:56; size:4; signed:1; > field:int next_prio; offset:60; size:4; signed:1; > > print fmt: "prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s%s ==> > next_comm=%s next_pid=%d next_prio=%d", REC->prev_comm, REC->prev_pid, > REC->prev_prio, (REC->prev_state & ((((0x00000000 | 0x00000001 | > 0x00000002 | 0x00000004 | 0x00000008 | 0x00000010 | 0x00000020 | > 0x00000040) + 1) << 1) - 1)) ? __print_flags(REC->prev_state & > ((((0x00000000 | 0x00000001 | 0x00000002 | 0x00000004 | 0x00000008 | > 0x00000010 | 0x00000020 | 0x00000040) + 1) << 1) - 1), "|", { > 0x00000001, "S" }, { 0x00000002, "D" }, { 0x00000004, "T" }, { > 0x00000008, "t" }, { 0x00000010, "X" }, { 0x00000020, "Z" }, { > 0x00000040, "P" }, { 0x00000080, "I" }) : "R", REC->prev_state & > (((0x00000000 | 0x00000001 | 0x00000002 | 0x00000004 | 0x00000008 | > 0x00000010 | 0x00000020 | 0x00000040) + 1) << 1) ? "+" : "", > REC->next_comm, REC->next_pid, REC->next_prio > > So may be these errors were caused by other issues ? As I said not really sure why it's happening but there's definitely an issue somewhere. I'm hoping someone more familiar with how Perfetto works might have an idea. > > Error stats for this trace: > > name idx source value > > ---------------------------------------- ---------------------------------------- ---------------------------------------- ---------------------------------------- > > mismatched_sched_switch_tids [NULL] analysis 11101 > > systrace_parse_failure [NULL] analysis 19040 > > > > The trace explorer window ends up containing the ftrace-specific tracks > > but missing the tracks related to Android-specific callbacks and such. > > > > Debug stats below in case they're relevant: > > > > Name Value Type > > android_br_parse_errors 0 error (trace) > > android_log_format_invalid 0 error (trace) > > android_log_num_failed 0 error (trace) > > android_log_num_skipped 0 info (trace) > > android_log_num_total 0 info (trace) > > clock_sync_cache_miss 181 info (analysis) > > clock_sync_failure 0 error (analysis) > > compact_sched_has_parse_errors 0 error (trace) > > compact_sched_switch_skipped 0 info (analysis) > > compact_sched_waking_skipped 0 info (analysis) > > counter_events_out_of_order 0 error (analysis) > > deobfuscate_location_parse_error 0 error (trace) > > empty_chrome_metadata 0 error (trace) > > energy_breakdown_missing_values 0 error (analysis) > > energy_descriptor_invalid 0 error (analysis) > > energy_uid_breakdown_missing_values 0 error (analysis) > > flow_duplicate_id 0 error (trace) > > flow_end_without_start 0 info (trace) > > flow_invalid_id 0 error (trace) > > flow_no_enclosing_slice 0 error (trace) > > flow_step_without_start 0 info (trace) > > flow_without_direction 0 error (trace) > > frame_timeline_event_parser_errors 0 info (analysis) > > ftrace_bundle_tokenizer_errors 0 error (analysis) > > ftrace_cpu_bytes_read_begin[0] 0 info (trace) > > ftrace_cpu_bytes_read_begin[1] 264 info (trace) > > ftrace_cpu_bytes_read_begin[2] 0 info (trace) > > ftrace_cpu_bytes_read_begin[3] 224 info (trace) > > ftrace_cpu_bytes_read_begin[4] 0 info (trace) > > ftrace_cpu_bytes_read_begin[5] 0 info (trace) > > ftrace_cpu_bytes_read_begin[6] 0 info (trace) > > ftrace_cpu_bytes_read_begin[7] 0 info (trace) > > ftrace_cpu_bytes_read_delta[0] 6919836 info (trace) > > ftrace_cpu_bytes_read_delta[1] 7197556 info (trace) > > ftrace_cpu_bytes_read_delta[2] 6381828 info (trace) > > ftrace_cpu_bytes_read_delta[3] 5988336 info (trace) > > ftrace_cpu_bytes_read_delta[4] 5933528 info (trace) > > ftrace_cpu_bytes_read_delta[5] 4858400 info (trace) > > ftrace_cpu_bytes_read_delta[6] 6175260 info (trace) > > ftrace_cpu_bytes_read_delta[7] 4633460 info (trace) > > ftrace_cpu_bytes_read_end[0] 6919836 info (trace) > > ftrace_cpu_bytes_read_end[1] 7197820 info (trace) > > ftrace_cpu_bytes_read_end[2] 6381828 info (trace) > > ftrace_cpu_bytes_read_end[3] 5988560 info (trace) > > ftrace_cpu_bytes_read_end[4] 5933528 info (trace) > > ftrace_cpu_bytes_read_end[5] 4858400 info (trace) > > ftrace_cpu_bytes_read_end[6] 6175260 info (trace) > > ftrace_cpu_bytes_read_end[7] 4633460 info (trace) > > ftrace_cpu_commit_overrun_begin[0] 0 info (trace) > > ftrace_cpu_commit_overrun_begin[1] 0 info (trace) > > ftrace_cpu_commit_overrun_begin[2] 0 info (trace) > > ftrace_cpu_commit_overrun_begin[3] 0 info (trace) > > ftrace_cpu_commit_overrun_begin[4] 0 info (trace) > > ftrace_cpu_commit_overrun_begin[5] 0 info (trace) > > ftrace_cpu_commit_overrun_begin[6] 0 info (trace) > > ftrace_cpu_commit_overrun_begin[7] 0 info (trace) > > ftrace_cpu_commit_overrun_delta[0] 0 error (trace) > > ftrace_cpu_commit_overrun_delta[1] 0 error (trace) > > ftrace_cpu_commit_overrun_delta[2] 0 error (trace) > > ftrace_cpu_commit_overrun_delta[3] 0 error (trace) > > ftrace_cpu_commit_overrun_delta[4] 0 error (trace) > > ftrace_cpu_commit_overrun_delta[5] 0 error (trace) > > ftrace_cpu_commit_overrun_delta[6] 0 error (trace) > > ftrace_cpu_commit_overrun_delta[7] 0 error (trace) > > ftrace_cpu_commit_overrun_end[0] 0 info (trace) > > ftrace_cpu_commit_overrun_end[1] 0 info (trace) > > ftrace_cpu_commit_overrun_end[2] 0 info (trace) > > ftrace_cpu_commit_overrun_end[3] 0 info (trace) > > ftrace_cpu_commit_overrun_end[4] 0 info (trace) > > ftrace_cpu_commit_overrun_end[5] 0 info (trace) > > ftrace_cpu_commit_overrun_end[6] 0 info (trace) > > ftrace_cpu_commit_overrun_end[7] 0 info (trace) > > ftrace_cpu_dropped_events_begin[0] 0 info (trace) > > ftrace_cpu_dropped_events_begin[1] 0 info (trace) > > ftrace_cpu_dropped_events_begin[2] 0 info (trace) > > ftrace_cpu_dropped_events_begin[3] 0 info (trace) > > ftrace_cpu_dropped_events_begin[4] 0 info (trace) > > ftrace_cpu_dropped_events_begin[5] 0 info (trace) > > ftrace_cpu_dropped_events_begin[6] 0 info (trace) > > ftrace_cpu_dropped_events_begin[7] 0 info (trace) > > ftrace_cpu_dropped_events_delta[0] 0 error (trace) > > ftrace_cpu_dropped_events_delta[1] 0 error (trace) > > ftrace_cpu_dropped_events_delta[2] 0 error (trace) > > ftrace_cpu_dropped_events_delta[3] 0 error (trace) > > ftrace_cpu_dropped_events_delta[4] 0 error (trace) > > ftrace_cpu_dropped_events_delta[5] 0 error (trace) > > ftrace_cpu_dropped_events_delta[6] 0 error (trace) > > ftrace_cpu_dropped_events_delta[7] 0 error (trace) > > ftrace_cpu_dropped_events_end[0] 0 info (trace) > > ftrace_cpu_dropped_events_end[1] 0 info (trace) > > ftrace_cpu_dropped_events_end[2] 0 info (trace) > > ftrace_cpu_dropped_events_end[3] 0 info (trace) > > ftrace_cpu_dropped_events_end[4] 0 info (trace) > > ftrace_cpu_dropped_events_end[5] 0 info (trace) > > ftrace_cpu_dropped_events_end[6] 0 info (trace) > > ftrace_cpu_dropped_events_end[7] 0 info (trace) > > ftrace_cpu_entries_begin[0] 0 info (trace) > > ftrace_cpu_entries_begin[1] 6 info (trace) > > ftrace_cpu_entries_begin[2] 0 info (trace) > > ftrace_cpu_entries_begin[3] 5 info (trace) > > ftrace_cpu_entries_begin[4] 0 info (trace) > > ftrace_cpu_entries_begin[5] 0 info (trace) > > ftrace_cpu_entries_begin[6] 0 info (trace) > > ftrace_cpu_entries_begin[7] 0 info (trace) > > ftrace_cpu_entries_delta[0] 6 info (trace) > > ftrace_cpu_entries_delta[1] -6 info (trace) > > ftrace_cpu_entries_delta[2] 0 info (trace) > > ftrace_cpu_entries_delta[3] 2 info (trace) > > ftrace_cpu_entries_delta[4] 0 info (trace) > > ftrace_cpu_entries_delta[5] 0 info (trace) > > ftrace_cpu_entries_delta[6] 0 info (trace) > > ftrace_cpu_entries_delta[7] 0 info (trace) > > ftrace_cpu_entries_end[0] 6 info (trace) > > ftrace_cpu_entries_end[1] 0 info (trace) > > ftrace_cpu_entries_end[2] 0 info (trace) > > ftrace_cpu_entries_end[3] 7 info (trace) > > ftrace_cpu_entries_end[4] 0 info (trace) > > ftrace_cpu_entries_end[5] 0 info (trace) > > ftrace_cpu_entries_end[6] 0 info (trace) > > ftrace_cpu_entries_end[7] 0 info (trace) > > ftrace_cpu_now_ts_begin[0] 93305027000 info (trace) > > ftrace_cpu_now_ts_begin[1] 93305103000 info (trace) > > ftrace_cpu_now_ts_begin[2] 93305159000 info (trace) > > ftrace_cpu_now_ts_begin[3] 93305207000 info (trace) > > ftrace_cpu_now_ts_begin[4] 93305262000 info (trace) > > ftrace_cpu_now_ts_begin[5] 93305312000 info (trace) > > ftrace_cpu_now_ts_begin[6] 93305362000 info (trace) > > ftrace_cpu_now_ts_begin[7] 93305411000 info (trace) > > ftrace_cpu_now_ts_end[0] 282906571000 info (trace) > > ftrace_cpu_now_ts_end[1] 282906676000 info (trace) > > ftrace_cpu_now_ts_end[2] 282906738000 info (trace) > > ftrace_cpu_now_ts_end[3] 282906803000 info (trace) > > ftrace_cpu_now_ts_end[4] 282906863000 info (trace) > > ftrace_cpu_now_ts_end[5] 282906925000 info (trace) > > ftrace_cpu_now_ts_end[6] 282906987000 info (trace) > > ftrace_cpu_now_ts_end[7] 282907048000 info (trace) > > ftrace_cpu_oldest_event_ts_begin[0] 0 info (trace) > > ftrace_cpu_oldest_event_ts_begin[1] 93304642000 info (trace) > > ftrace_cpu_oldest_event_ts_begin[2] 0 info (trace) > > ftrace_cpu_oldest_event_ts_begin[3] 93304876000 info (trace) > > ftrace_cpu_oldest_event_ts_begin[4] 0 info (trace) > > ftrace_cpu_oldest_event_ts_begin[5] 0 info (trace) > > ftrace_cpu_oldest_event_ts_begin[6] 0 info (trace) > > ftrace_cpu_oldest_event_ts_begin[7] 0 info (trace) > > ftrace_cpu_oldest_event_ts_end[0] 282905715000 info (trace) > > ftrace_cpu_oldest_event_ts_end[1] 282903723000 info (trace) > > ftrace_cpu_oldest_event_ts_end[2] 282903881000 info (trace) > > ftrace_cpu_oldest_event_ts_end[3] 282816175000 info (trace) > > ftrace_cpu_oldest_event_ts_end[4] 282896619000 info (trace) > > ftrace_cpu_oldest_event_ts_end[5] 282884168000 info (trace) > > ftrace_cpu_oldest_event_ts_end[6] 282783221000 info (trace) > > ftrace_cpu_oldest_event_ts_end[7] 282880081000 info (trace) > > ftrace_cpu_overrun_begin[0] 0 info (trace) > > ftrace_cpu_overrun_begin[1] 0 info (trace) > > ftrace_cpu_overrun_begin[2] 0 info (trace) > > ftrace_cpu_overrun_begin[3] 0 info (trace) > > ftrace_cpu_overrun_begin[4] 0 info (trace) > > ftrace_cpu_overrun_begin[5] 0 info (trace) > > ftrace_cpu_overrun_begin[6] 0 info (trace) > > ftrace_cpu_overrun_begin[7] 0 info (trace) > > ftrace_cpu_overrun_delta[0]help_outline 0 data_loss (trace) > > ftrace_cpu_overrun_delta[1]help_outline 0 data_loss (trace) > > ftrace_cpu_overrun_delta[2]help_outline 0 data_loss (trace) > > ftrace_cpu_overrun_delta[3]help_outline 0 data_loss (trace) > > ftrace_cpu_overrun_delta[4]help_outline 0 data_loss (trace) > > ftrace_cpu_overrun_delta[5]help_outline 0 data_loss (trace) > > ftrace_cpu_overrun_delta[6]help_outline 0 data_loss (trace) > > ftrace_cpu_overrun_delta[7]help_outline 0 data_loss (trace) > > ftrace_cpu_overrun_end[0] 0 info (trace) > > ftrace_cpu_overrun_end[1] 0 info (trace) > > ftrace_cpu_overrun_end[2] 0 info (trace) > > ftrace_cpu_overrun_end[3] 0 info (trace) > > ftrace_cpu_overrun_end[4] 0 info (trace) > > ftrace_cpu_overrun_end[5] 0 info (trace) > > ftrace_cpu_overrun_end[6] 0 info (trace) > > ftrace_cpu_overrun_end[7] 0 info (trace) > > ftrace_cpu_read_events_begin[0] 0 info (trace) > > ftrace_cpu_read_events_begin[1] 0 info (trace) > > ftrace_cpu_read_events_begin[2] 0 info (trace) > > ftrace_cpu_read_events_begin[3] 0 info (trace) > > ftrace_cpu_read_events_begin[4] 0 info (trace) > > ftrace_cpu_read_events_begin[5] 0 info (trace) > > ftrace_cpu_read_events_begin[6] 0 info (trace) > > ftrace_cpu_read_events_begin[7] 0 info (trace) > > ftrace_cpu_read_events_delta[0] 454848 info (trace) > > ftrace_cpu_read_events_delta[1] 453484 info (trace) > > ftrace_cpu_read_events_delta[2] 386290 info (trace) > > ftrace_cpu_read_events_delta[3] 356432 info (trace) > > ftrace_cpu_read_events_delta[4] 393337 info (trace) > > ftrace_cpu_read_events_delta[5] 325244 info (trace) > > ftrace_cpu_read_events_delta[6] 392637 info (trace) > > ftrace_cpu_read_events_delta[7] 350623 info (trace) > > ftrace_cpu_read_events_end[0] 454848 info (trace) > > ftrace_cpu_read_events_end[1] 453484 info (trace) > > ftrace_cpu_read_events_end[2] 386290 info (trace) > > ftrace_cpu_read_events_end[3] 356432 info (trace) > > ftrace_cpu_read_events_end[4] 393337 info (trace) > > ftrace_cpu_read_events_end[5] 325244 info (trace) > > ftrace_cpu_read_events_end[6] 392637 info (trace) > > ftrace_cpu_read_events_end[7] 350623 info (trace) > > ftrace_packet_before_tracing_starthelp_outline 0 info (analysis) > > ftrace_setup_errorshelp_outline 0 error (trace) > > fuchsia_invalid_event 0 error (analysis) > > fuchsia_non_numeric_counters 0 error (analysis) > > fuchsia_timestamp_overflow 0 error (analysis) > > game_intervention_has_parse_errorshelp_outline 0 error (trace) > > game_intervention_has_read_errorshelp_outline 0 error (trace) > > gpu_counters_invalid_spec 0 error (analysis) > > gpu_counters_missing_spec 0 error (analysis) > > gpu_render_stage_parser_errors 0 error (analysis) > > graphics_frame_event_parser_errors 0 info (analysis) > > guess_trace_type_duration_ns 7654 info (analysis) > > heap_graph_non_finalized_graph 0 error (trace) > > heapprofd_missing_packet 0 error (trace) > > heapprofd_non_finalized_profile 0 error (trace) > > interned_data_tokenizer_errors 0 info (analysis) > > invalid_clock_snapshots 0 error (analysis) > > invalid_cpu_times 0 error (analysis) > > json_display_time_unithelp_outline 0 info (trace) > > json_parser_failure 0 error (trace) > > json_tokenizer_failure 0 error (trace) > > meminfo_unknown_keys 0 error (analysis) > > memory_snapshot_parser_failure 0 error (analysis) > > metatrace_overruns 0 error (trace) > > mismatched_sched_switch_tids 11101 error (analysis) > > misplaced_end_event 0 data_loss (analysis) > > mm_unknown_type 0 error (analysis) > > ninja_parse_errors 0 error (trace) > > packages_list_has_parse_errors 0 error (trace) > > packages_list_has_read_errors 0 error (trace) > > parse_trace_duration_ns 1780589548 info (analysis) > > perf_samples_skipped 0 info (trace) > > perf_samples_skipped_dataloss 0 data_loss (trace) > > power_rail_unknown_index 0 error (trace) > > proc_stat_unknown_counters 0 error (analysis) > > process_tracker_errors 0 error (analysis) > > rss_stat_negative_size 0 info (analysis) > > rss_stat_unknown_keys 0 error (analysis) > > rss_stat_unknown_thread_for_mm_id 0 info (analysis) > > sched_switch_out_of_order 0 error (analysis) > > sched_waking_out_of_order 0 error (analysis) > > slice_out_of_order 0 error (analysis) > > sorter_push_event_out_of_orderhelp_outline 0 error (trace) > > stackprofile_invalid_callstack_id 0 error (trace) > > stackprofile_invalid_frame_id 0 error (trace) > > stackprofile_invalid_mapping_id 0 error (trace) > > stackprofile_invalid_string_id 0 error (trace) > > stackprofile_parser_error 0 error (trace) > > symbolization_tmp_build_id_not_foundhelp_outline 0 error (analysis) > > systrace_parse_failure 19040 error (analysis) > > task_state_invalid 0 error (analysis) > > thread_time_in_state_out_of_order 0 error (analysis) > > thread_time_in_state_unknown_cpu_freq 0 error (analysis) > > tokenizer_skipped_packets 0 info (analysis) > > traced_buf_abi_violations[0] 0 data_loss (trace) > > traced_buf_abi_violations[1] 0 data_loss (trace) > > traced_buf_buffer_size[0] 534773760 info (trace) > > traced_buf_buffer_size[1] 2097152 info (trace) > > traced_buf_bytes_overwritten[0] 0 info (trace) > > traced_buf_bytes_overwritten[1] 0 info (trace) > > traced_buf_bytes_read[0] 78929920 info (trace) > > traced_buf_bytes_read[1] 425984 info (trace) > > traced_buf_bytes_written[0] 78962688 info (trace) > > traced_buf_bytes_written[1] 425984 info (trace) > > traced_buf_chunks_committed_out_of_order[0] 0 info (trace) > > traced_buf_chunks_committed_out_of_order[1] 0 info (trace) > > traced_buf_chunks_discarded[0] 0 info (trace) > > traced_buf_chunks_discarded[1] 0 info (trace) > > traced_buf_chunks_overwritten[0] 0 info (trace) > > traced_buf_chunks_overwritten[1] 0 info (trace) > > traced_buf_chunks_read[0] 2428 info (trace) > > traced_buf_chunks_read[1] 13 info (trace) > > traced_buf_chunks_rewritten[0] 6 info (trace) > > traced_buf_chunks_rewritten[1] 0 info (trace) > > traced_buf_chunks_written[0] 2429 info (trace) > > traced_buf_chunks_written[1] 13 info (trace) > > traced_buf_padding_bytes_cleared[0] 0 info (trace) > > traced_buf_padding_bytes_cleared[1] 0 info (trace) > > traced_buf_padding_bytes_written[0] 0 info (trace) > > traced_buf_padding_bytes_written[1] 0 info (trace) > > traced_buf_patches_failed[0] 0 data_loss (trace) > > traced_buf_patches_failed[1] 0 data_loss (trace) > > traced_buf_patches_succeeded[0] 5633 info (trace) > > traced_buf_patches_succeeded[1] 8 info (trace) > > traced_buf_readaheads_failed[0] 115 info (trace) > > traced_buf_readaheads_failed[1] 18 info (trace) > > traced_buf_readaheads_succeeded[0] 2257 info (trace) > > traced_buf_readaheads_succeeded[1] 6 info (trace) > > traced_buf_trace_writer_packet_loss[0] 0 data_loss (trace) > > traced_buf_trace_writer_packet_loss[1] 0 data_loss (trace) > > traced_buf_write_wrap_count[0] 0 info (trace) > > traced_buf_write_wrap_count[1] 0 info (trace) > > traced_chunks_discarded 0 info (trace) > > traced_data_sources_registered 16 info (trace) > > traced_data_sources_seen 6 info (trace) > > traced_final_flush_failed 0 data_loss (trace) > > traced_final_flush_succeeded 0 info (trace) > > traced_flushes_failed 0 data_loss (trace) > > traced_flushes_requested 0 info (trace) > > traced_flushes_succeeded 0 info (trace) > > traced_patches_discarded 0 info (trace) > > traced_producers_connected 3 info (trace) > > traced_producers_seen 3 info (trace) > > traced_total_buffers 2 info (trace) > > traced_tracing_sessions 1 info (trace) > > track_event_dropped_packets_outside_of_range_of_interesthelp_outline 0 info (analysis) > > track_event_parser_errors 0 info (analysis) > > track_event_thread_invalid_endhelp_outline 0 error (trace) > > track_event_tokenizer_errors 0 info (analysis) > > truncated_sys_write_durationhelp_outline 0 data_loss (analysis) > > unknown_extension_fieldshelp_outline 0 error (trace) > > vmstat_unknown_keys 0 error (analysis) > > vulkan_allocations_invalid_string_id 0 error (trace) > > > > > -- > > > Regards > > > Yafang > > > > > > > -- > Regards > Yafang
On 02/09/23 23:37, Yafang Shao wrote: > On Thu, Feb 9, 2023 at 10:28 PM Kajetan Puchalski > <kajetan.puchalski@arm.com> wrote: > > > > On Thu, Feb 09, 2023 at 02:20:36PM +0800, Yafang Shao wrote: > > > > [...] > > > > Hi Yafang, > > > > > Many thanks for the detailed analysis. Seems it can work. > > > > > > Hi John, > > > > > > Could you pls. try the attached fix ? I have verified it in my test env. > > > > I tested the patch on my environment where I found the issue with newer > > kernels + older Perfetto. The patch does improve things so that's nice. > > Thanks for the test. I don't have Perfetto in hand, so I haven't > verify Perfetto. FWIW, perfetto is not android specific and can run on normal linux distro setup (which I do but haven't noticed this breakage). It's easy to download the latest release (including for android though I never tried that) from github https://github.com/google/perfetto/releases Kajetan might try to see if he can pick the latest version which IIUC contains a workaround. If this simple patch can be tweaked to make it work again against older versions that'd be nice though. HTH. Cheers -- Qais Yousef
On Wed, 8 Feb 2023 21:33:43 -0500 Steven Rostedt <rostedt@goodmis.org> wrote: > OK, so it doesn't break perf, trace-cmd and rasdaemon, because the enum is > only needed in the print_fmt part. It can handle it in the field portion. > > That is: > > > system: sched > name: sched_switch > ID: 285 > format: > field:unsigned short common_type; offset:0; size:2; signed:0; > field:unsigned char common_flags; offset:2; size:1; signed:0; > field:unsigned char common_preempt_count; offset:3; size:1; signed:0; > field:int common_pid; offset:4; size:4; signed:1; > > field:char prev_comm[TASK_COMM_LEN]; offset:8; size:16; signed:0; > ^^^^^^^^^^^^^^ ^^ > is ignored is used > > > field:pid_t prev_pid; offset:24; size:4; signed:1; > field:int prev_prio; offset:28; size:4; signed:1; > field:long prev_state; offset:32; size:8; signed:1; > field:char next_comm[TASK_COMM_LEN]; offset:40; size:16; signed:0; > field:pid_t next_pid; offset:56; size:4; signed:1; > field:int next_prio; offset:60; size:4; signed:1; > > print fmt: "prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s%s ==> next_comm=%s next_pid=%d next_prio=%d", REC->prev_comm, REC->prev_pid, REC->prev_prio, (REC->prev_state & ((((0x00000000 | 0x00000001 | 0x00000002 | 0x00000004 | 0x00000008 | 0x00000010 | 0x00000020 | 0x00000040) + 1) << 1) - 1)) ? __print_flags(REC->prev_state & ((((0x00000000 | 0x00000001 | 0x00000002 | 0x00000004 | 0x00000008 | 0x00000010 | 0x00000020 | 0x00000040) + 1) << 1) - 1), "|", { 0x00000001, "S" }, { 0x00000002, "D" }, { 0x00000004, "T" }, { 0x00000008, "t" }, { 0x00000010, "X" }, { 0x00000020, "Z" }, { 0x00000040, "P" }, { 0x00000080, "I" }) : "R", REC->prev_state & (((0x00000000 | 0x00000001 | 0x00000002 | 0x00000004 | 0x00000008 | 0x00000010 | 0x00000020 | 0x00000040) + 1) << 1) ? "+" : "", REC->next_comm, REC->next_pid, REC->next_prio > > ^^^^^^^ > > Is what requires the conversions. So I take that back. It only breaks > perfetto, and that's because it writes its own parser and doesn't use > libtraceevent. Actually, there are cases that this needs to be a number, as b3bc8547d3be6 ("tracing: Have TRACE_DEFINE_ENUM affect trace event types as well") made it update fields as well as the printk fmt. I think because libtraceevent noticed that it was a "char" array, it just defaults to "size". But this does have meaning for all other types, and I can see other parsers requiring that. -- Steve
On Sun, Feb 12, 2023 at 12:51 AM Qais Yousef <qyousef@layalina.io> wrote: > > On 02/09/23 23:37, Yafang Shao wrote: > > On Thu, Feb 9, 2023 at 10:28 PM Kajetan Puchalski > > <kajetan.puchalski@arm.com> wrote: > > > > > > On Thu, Feb 09, 2023 at 02:20:36PM +0800, Yafang Shao wrote: > > > > > > [...] > > > > > > Hi Yafang, > > > > > > > Many thanks for the detailed analysis. Seems it can work. > > > > > > > > Hi John, > > > > > > > > Could you pls. try the attached fix ? I have verified it in my test env. > > > > > > I tested the patch on my environment where I found the issue with newer > > > kernels + older Perfetto. The patch does improve things so that's nice. > > > > Thanks for the test. I don't have Perfetto in hand, so I haven't > > verify Perfetto. > > FWIW, perfetto is not android specific and can run on normal linux distro setup > (which I do but haven't noticed this breakage). > > It's easy to download the latest release (including for android though I never > tried that) from github > > https://github.com/google/perfetto/releases > Thanks for the information. I will try to run it on my test env. I suspect the "systrace_parse_failure" error is caused by the field we introduced into struct ftrace_event_field in the proposed patch, but I haven't taken a deep look at the perfetto src code yet. > Kajetan might try to see if he can pick the latest version which IIUC contains > a workaround. > > If this simple patch can be tweaked to make it work again against older > versions that'd be nice though. > > HTH. > > > Cheers > > -- > Qais Yousef
On Sun, Feb 12, 2023 at 3:00 AM Steven Rostedt <rostedt@goodmis.org> wrote: > > On Wed, 8 Feb 2023 21:33:43 -0500 > Steven Rostedt <rostedt@goodmis.org> wrote: > > > OK, so it doesn't break perf, trace-cmd and rasdaemon, because the enum is > > only needed in the print_fmt part. It can handle it in the field portion. > > > > That is: > > > > > > system: sched > > name: sched_switch > > ID: 285 > > format: > > field:unsigned short common_type; offset:0; size:2; signed:0; > > field:unsigned char common_flags; offset:2; size:1; signed:0; > > field:unsigned char common_preempt_count; offset:3; size:1; signed:0; > > field:int common_pid; offset:4; size:4; signed:1; > > > > field:char prev_comm[TASK_COMM_LEN]; offset:8; size:16; signed:0; > > ^^^^^^^^^^^^^^ ^^ > > is ignored is used > > > > > > field:pid_t prev_pid; offset:24; size:4; signed:1; > > field:int prev_prio; offset:28; size:4; signed:1; > > field:long prev_state; offset:32; size:8; signed:1; > > field:char next_comm[TASK_COMM_LEN]; offset:40; size:16; signed:0; > > field:pid_t next_pid; offset:56; size:4; signed:1; > > field:int next_prio; offset:60; size:4; signed:1; > > > > print fmt: "prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s%s ==> next_comm=%s next_pid=%d next_prio=%d", REC->prev_comm, REC->prev_pid, REC->prev_prio, (REC->prev_state & ((((0x00000000 | 0x00000001 | 0x00000002 | 0x00000004 | 0x00000008 | 0x00000010 | 0x00000020 | 0x00000040) + 1) << 1) - 1)) ? __print_flags(REC->prev_state & ((((0x00000000 | 0x00000001 | 0x00000002 | 0x00000004 | 0x00000008 | 0x00000010 | 0x00000020 | 0x00000040) + 1) << 1) - 1), "|", { 0x00000001, "S" }, { 0x00000002, "D" }, { 0x00000004, "T" }, { 0x00000008, "t" }, { 0x00000010, "X" }, { 0x00000020, "Z" }, { 0x00000040, "P" }, { 0x00000080, "I" }) : "R", REC->prev_state & (((0x00000000 | 0x00000001 | 0x00000002 | 0x00000004 | 0x00000008 | 0x00000010 | 0x00000020 | 0x00000040) + 1) << 1) ? "+" : "", REC->next_comm, REC->next_pid, REC->next_prio > > > > ^^^^^^^ > > > > Is what requires the conversions. So I take that back. It only breaks > > perfetto, and that's because it writes its own parser and doesn't use > > libtraceevent. > > Actually, there are cases that this needs to be a number, as b3bc8547d3be6 > ("tracing: Have TRACE_DEFINE_ENUM affect trace event types as well") made > it update fields as well as the printk fmt. > It seems that TRACE_DEFINE_ENUM(TASK_COMM_LEN) in the trace events header files would be a better fix. > I think because libtraceevent noticed that it was a "char" array, it just > defaults to "size". But this does have meaning for all other types, and I > can see other parsers requiring that. > > -- Steve
On Sun, 12 Feb 2023 11:38:52 +0800 Yafang Shao <laoar.shao@gmail.com> wrote: > > Actually, there are cases that this needs to be a number, as b3bc8547d3be6 > > ("tracing: Have TRACE_DEFINE_ENUM affect trace event types as well") made > > it update fields as well as the printk fmt. > > > > It seems that TRACE_DEFINE_ENUM(TASK_COMM_LEN) in the trace events > header files would be a better fix. NACK! I much prefer the proper fix that adds the length. -- Steve
Hi Steve, On Sat, Feb 11, 2023 at 8:07 PM Steven Rostedt <rostedt@goodmis.org> wrote: > > On Sun, 12 Feb 2023 11:38:52 +0800 > Yafang Shao <laoar.shao@gmail.com> wrote: > > > > Actually, there are cases that this needs to be a number, as b3bc8547d3be6 > > > ("tracing: Have TRACE_DEFINE_ENUM affect trace event types as well") made > > > it update fields as well as the printk fmt. > > > > > > > It seems that TRACE_DEFINE_ENUM(TASK_COMM_LEN) in the trace events > > header files would be a better fix. > > NACK! I much prefer the proper fix that adds the length. Can we just have both enum and macro at the same time? I guess the enum would fill the BTF and the macro would provide backward compatibility. Thanks, Namhyung
On 2023-02-13 12:43, Namhyung Kim wrote: > Hi Steve, > > On Sat, Feb 11, 2023 at 8:07 PM Steven Rostedt <rostedt@goodmis.org> wrote: >> >> On Sun, 12 Feb 2023 11:38:52 +0800 >> Yafang Shao <laoar.shao@gmail.com> wrote: >> >>>> Actually, there are cases that this needs to be a number, as b3bc8547d3be6 >>>> ("tracing: Have TRACE_DEFINE_ENUM affect trace event types as well") made >>>> it update fields as well as the printk fmt. >>>> >>> >>> It seems that TRACE_DEFINE_ENUM(TASK_COMM_LEN) in the trace events >>> header files would be a better fix. >> >> NACK! I much prefer the proper fix that adds the length. > > Can we just have both enum and macro at the same time? > I guess the enum would fill the BTF and the macro would provide > backward compatibility. This is no need to keep the define. The root cause of the issue is addressed by this fix: https://lore.kernel.org/lkml/20230212154620.4d8fe033@gandalf.local.home/ Thanks, Mathieu > > Thanks, > Namhyung
diff --git a/include/linux/sched.h b/include/linux/sched.h index 78c351e35fec..cecd4806edc6 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -274,8 +274,13 @@ struct task_group; #define get_current_state() READ_ONCE(current->__state) -/* Task command name length: */ -#define TASK_COMM_LEN 16 +/* + * Define the task command name length as enum, then it can be visible to + * BPF programs. + */ +enum { + TASK_COMM_LEN = 16, +}; extern void scheduler_tick(void); diff --git a/tools/testing/selftests/bpf/progs/test_stacktrace_map.c b/tools/testing/selftests/bpf/progs/test_stacktrace_map.c index a8233e7f173b..728dbd39eff0 100644 --- a/tools/testing/selftests/bpf/progs/test_stacktrace_map.c +++ b/tools/testing/selftests/bpf/progs/test_stacktrace_map.c @@ -1,7 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 // Copyright (c) 2018 Facebook -#include <linux/bpf.h> +#include <vmlinux.h> #include <bpf/bpf_helpers.h> #ifndef PERF_MAX_STACK_DEPTH @@ -41,11 +41,11 @@ struct { /* taken from /sys/kernel/debug/tracing/events/sched/sched_switch/format */ struct sched_switch_args { unsigned long long pad; - char prev_comm[16]; + char prev_comm[TASK_COMM_LEN]; int prev_pid; int prev_prio; long long prev_state; - char next_comm[16]; + char next_comm[TASK_COMM_LEN]; int next_pid; int next_prio; }; diff --git a/tools/testing/selftests/bpf/progs/test_tracepoint.c b/tools/testing/selftests/bpf/progs/test_tracepoint.c index ce6974016f53..43bd7a20cc50 100644 --- a/tools/testing/selftests/bpf/progs/test_tracepoint.c +++ b/tools/testing/selftests/bpf/progs/test_tracepoint.c @@ -1,17 +1,17 @@ // SPDX-License-Identifier: GPL-2.0 // Copyright (c) 2017 Facebook -#include <linux/bpf.h> +#include <vmlinux.h> #include <bpf/bpf_helpers.h> /* taken from /sys/kernel/debug/tracing/events/sched/sched_switch/format */ struct sched_switch_args { unsigned long long pad; - char prev_comm[16]; + char prev_comm[TASK_COMM_LEN]; int prev_pid; int prev_prio; long long prev_state; - char next_comm[16]; + char next_comm[TASK_COMM_LEN]; int next_pid; int next_prio; };