Message ID | 20200626162706.3304357-4-dinechin@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | trace: Add a trace backend for the recorder library | expand |
On Fri, Jun 26, 2020 at 06:27:06PM +0200, Christophe de Dinechin wrote: > This is an example showing how the recorder can be used to have one > "topic" covering multiple entries. Here, the topic is "lock". > > Here are a few use cases: > > - Checking locks: > RECORDER_TRACES=lock qemu > - Graphic visualization of locks: > RECORDER_TRACES="lock=state,id" qemu & > recorder_scope state > <Hit the 't' key to toggle timing display> > <Hit the 'c' key to dump the screen data as CSV> > cat recorder_scope_data-1.csv > > Signed-off-by: Christophe de Dinechin <dinechin@redhat.com> > --- > util/qemu-thread-common.h | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/util/qemu-thread-common.h b/util/qemu-thread-common.h > index 2af6b12085..0de07a471f 100644 > --- a/util/qemu-thread-common.h > +++ b/util/qemu-thread-common.h > @@ -15,6 +15,9 @@ > > #include "qemu/thread.h" > #include "trace.h" > +#include "trace/recorder.h" > + > +RECORDER_DEFINE(lock, 16, "Lock state"); > > static inline void qemu_mutex_post_init(QemuMutex *mutex) > { > @@ -23,12 +26,14 @@ static inline void qemu_mutex_post_init(QemuMutex *mutex) > mutex->line = 0; > #endif > mutex->initialized = true; > + record(lock, "Init state %d for %p", -1, mutex); > } > > static inline void qemu_mutex_pre_lock(QemuMutex *mutex, > const char *file, int line) > { > trace_qemu_mutex_lock(mutex, file, line); > + record(lock, "Locking state %d for %p", 1, mutex); > } > > static inline void qemu_mutex_post_lock(QemuMutex *mutex, > @@ -39,6 +44,7 @@ static inline void qemu_mutex_post_lock(QemuMutex *mutex, > mutex->line = line; > #endif > trace_qemu_mutex_locked(mutex, file, line); > + record(lock, "Locked state %d for %p", 2, mutex); > } > > static inline void qemu_mutex_pre_unlock(QemuMutex *mutex, > @@ -49,6 +55,7 @@ static inline void qemu_mutex_pre_unlock(QemuMutex *mutex, > mutex->line = 0; > #endif > trace_qemu_mutex_unlock(mutex, file, line); > + record(lock, "Unkocked state %d for %p", 0, mutex); > } IMHO the whole point of having the pluggable trace backend impls, is precisely that we don't have to add multiple different calls in the code. A single trace_qemu_mutex_unlock() is supposed to work with any backend. Regards, Daniel
On Fri, Jun 26, 2020 at 06:27:06PM +0200, Christophe de Dinechin wrote: > This is an example showing how the recorder can be used to have one > "topic" covering multiple entries. Here, the topic is "lock". > > Here are a few use cases: > > - Checking locks: > RECORDER_TRACES=lock qemu > - Graphic visualization of locks: > RECORDER_TRACES="lock=state,id" qemu & > recorder_scope state > <Hit the 't' key to toggle timing display> > <Hit the 'c' key to dump the screen data as CSV> > cat recorder_scope_data-1.csv > > Signed-off-by: Christophe de Dinechin <dinechin@redhat.com> > --- > util/qemu-thread-common.h | 7 +++++++ > 1 file changed, 7 insertions(+) Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
On Tue, Jun 30, 2020 at 01:41:36PM +0100, Daniel P. Berrangé wrote: > On Fri, Jun 26, 2020 at 06:27:06PM +0200, Christophe de Dinechin wrote: > IMHO the whole point of having the pluggable trace backend impls, is > precisely that we don't have to add multiple different calls in the > code. A single trace_qemu_mutex_unlock() is supposed to work with > any backend. I think an exception is okay when the other trace backends do not offer equivalent functionality. Who knows if anyone other than Christophe will use this functionality, but it doesn't cost much to allow it. Stefan
On Wed, Jul 01, 2020 at 05:09:06PM +0100, Stefan Hajnoczi wrote: > On Tue, Jun 30, 2020 at 01:41:36PM +0100, Daniel P. Berrangé wrote: > > On Fri, Jun 26, 2020 at 06:27:06PM +0200, Christophe de Dinechin wrote: > > IMHO the whole point of having the pluggable trace backend impls, is > > precisely that we don't have to add multiple different calls in the > > code. A single trace_qemu_mutex_unlock() is supposed to work with > > any backend. > > I think an exception is okay when the other trace backends do not offer > equivalent functionality. > > Who knows if anyone other than Christophe will use this functionality, > but it doesn't cost much to allow it. This patch is just an example though, suggesting this kind of usage is expected to done in other current trace probe locations. The trace wrapper has most of the information required already including a format string, so I'd think it could be wired up to the generator so we don't add extra record() statements through the codebase. At most it should require an extra annotation in the trace-events file to take the extra parameter for grouping, and other trace backends can ignore that. Regards, Daniel
On Wed, Jul 01, 2020 at 05:15:01PM +0100, Daniel P. Berrangé wrote: > On Wed, Jul 01, 2020 at 05:09:06PM +0100, Stefan Hajnoczi wrote: > > On Tue, Jun 30, 2020 at 01:41:36PM +0100, Daniel P. Berrangé wrote: > > > On Fri, Jun 26, 2020 at 06:27:06PM +0200, Christophe de Dinechin wrote: > > > IMHO the whole point of having the pluggable trace backend impls, is > > > precisely that we don't have to add multiple different calls in the > > > code. A single trace_qemu_mutex_unlock() is supposed to work with > > > any backend. > > > > I think an exception is okay when the other trace backends do not offer > > equivalent functionality. > > > > Who knows if anyone other than Christophe will use this functionality, > > but it doesn't cost much to allow it. > > This patch is just an example though, suggesting this kind of usage is > expected to done in other current trace probe locations. The trace wrapper > has most of the information required already including a format string, > so I'd think it could be wired up to the generator so we don't add extra > record() statements through the codebase. At most it should require an > extra annotation in the trace-events file to take the extra parameter > for grouping, and other trace backends can ignore that. It's true, it may be possible to put this functionality in the trace-events. Christophe: how does this differ from regular trace events and what extra information is needed? Stefan
On 2020-07-02 at 15:47 CEST, Stefan Hajnoczi wrote... > On Wed, Jul 01, 2020 at 05:15:01PM +0100, Daniel P. Berrangé wrote: >> On Wed, Jul 01, 2020 at 05:09:06PM +0100, Stefan Hajnoczi wrote: >> > On Tue, Jun 30, 2020 at 01:41:36PM +0100, Daniel P. Berrangé wrote: >> > > On Fri, Jun 26, 2020 at 06:27:06PM +0200, Christophe de Dinechin wrote: >> > > IMHO the whole point of having the pluggable trace backend impls, is >> > > precisely that we don't have to add multiple different calls in the >> > > code. A single trace_qemu_mutex_unlock() is supposed to work with >> > > any backend. >> > >> > I think an exception is okay when the other trace backends do not offer >> > equivalent functionality. >> > >> > Who knows if anyone other than Christophe will use this functionality, >> > but it doesn't cost much to allow it. >> >> This patch is just an example though, suggesting this kind of usage is >> expected to done in other current trace probe locations. The trace wrapper >> has most of the information required already including a format string, >> so I'd think it could be wired up to the generator so we don't add extra >> record() statements through the codebase. The primary purpose of the recorder is post-mortem dumps, flight recorder style. Tracing is only a secondary benefit. Not sure if it's worth making a distinction between events you want to record and those you want to trace. (Example: You might want to record all command line options, but almost never trace them) > At most it should require an >> extra annotation in the trace-events file to take the extra parameter >> for grouping, and other trace backends can ignore that. > > It's true, it may be possible to put this functionality in the > trace-events. Let me think more about integrating these features with other trace backends. See below for short-term impact. > Christophe: how does this differ from regular trace events and what > extra information is needed? - Grouping, as indicated above, mostly useful in practice to make selection of tracing topics easy (e.g. "modules") but also for real-time graphing, because typically a state change occurs in different functions, which is why I used locking as an example. - Self-documentation. Right now, the recorder back-end generates rather unhelpful help messages. - Trace buffer size. This is important to make post-mortem dumps usable if you record infrequent events alongside much higher-rate ones. For example, you may want a large buffer to record info about command-line option processing, the default 8 is definitely too small. - Support for %+s, which tells that a string is safe to print later (e.g. it is a compile-time constant, or never ever freed). - Support for custom formats, e.g. I use %v in the XL compiler for LLVM value pointers. This is a bit more advanced, but it would be neat to be able to print out QOM objects using %q :-) For the short term, what about providing trace-named wrappers around these additional recorder features? -- Cheers, Christophe de Dinechin (IRC c3d)
On Fri, Jul 03, 2020 at 12:12:14PM +0200, Christophe de Dinechin wrote: > > On 2020-07-02 at 15:47 CEST, Stefan Hajnoczi wrote... > > On Wed, Jul 01, 2020 at 05:15:01PM +0100, Daniel P. Berrangé wrote: > > At most it should require an > >> extra annotation in the trace-events file to take the extra parameter > >> for grouping, and other trace backends can ignore that. > > > > It's true, it may be possible to put this functionality in the > > trace-events. > > Let me think more about integrating these features with other trace > backends. See below for short-term impact. > > > Christophe: how does this differ from regular trace events and what > > extra information is needed? > > - Grouping, as indicated above, mostly useful in practice to make selection > of tracing topics easy (e.g. "modules") but also for real-time graphing, > because typically a state change occurs in different functions, which is > why I used locking as an example. In many cases we already have a form of grouping via the backdoor by using a common string prefix on trace point names. eg qio_task_new(void *task, void *source, void *func, void *opaque) "Task new task=%p source=%p func=%p opaque=%p" qio_task_complete(void *task) "Task complete task=%p" qio_task_thread_start(void *task, void *worker, void *opaque) "Task thread start task=%p worker=%p opaque=%p" qio_task_thread_run(void *task) "Task thread run task=%p" qio_task_thread_exit(void *task) "Task thread exit task=%p" qio_task_thread_result(void *task) "Task thread result task=%p" qio_task_thread_source_attach(void *task, void *source) "Task thread source attach task=%p source=%p" qio_task_thread_source_cancel(void *task, void *source) "Task thread source cancel task=%p source=%p" We could turn this into an explicit multi-level grouping concept, using "." separator for each level of the group. eg qio.task.new(void *task, void *source, void *func, void *opaque) "Task new task=%p source=%p func=%p opaque=%p" qio.task.complete(void *task) "Task complete task=%p" qio.task.thread_start(void *task, void *worker, void *opaque) "Task thread start task=%p worker=%p opaque=%p" qio.task.thread_run(void *task) "Task thread run task=%p" qio.task.thread_exit(void *task) "Task thread exit task=%p" qio.task.thread_result(void *task) "Task thread result task=%p" qio.task.thread_source_attach(void *task, void *source) "Task thread source attach task=%p source=%p" qio.task.thread_source_cancel(void *task, void *source) "Task thread source cancel task=%p source=%p" For backends that dont support groups, the "." can be turned back into a "_", and everything is unchanged. For backends that do support groups, we now have the info required. This would be useful for the systemtap backend because that has an explicit grouping concept already. ie we have probes definitions like: probe qemu.system.x86_64.qio_task_new = process("/usr/bin/qemu-system-x86_64").mark("qio_task_new") { task = $arg1; source = $arg2; func = $arg3; opaque = $arg4; } which we can now turn into probe qemu.system.x86_64.qio.task.new = process("/usr/bin/qemu-system-x86_64").mark("qio_task_new") { task = $arg1; source = $arg2; func = $arg3; opaque = $arg4; } the generated C trace macro would still use "_" not "." of course so qio_task_new(src, func, opaque) > - Self-documentation. Right now, the recorder back-end generates rather > unhelpful help messages. Not sure I understand the problem here. > - Trace buffer size. This is important to make post-mortem dumps usable if > you record infrequent events alongside much higher-rate ones. For example, > you may want a large buffer to record info about command-line option > processing, the default 8 is definitely too small. Sure, that's a problem for all the trace backends. They all need to be good at filtering / controlling what probes are fired to avoid signal getting lost in noise. The grouping feature would certainly help with that making it easier to control what it turned on/off. > - Support for %+s, which tells that a string is safe to print later (e.g. it > is a compile-time constant, or never ever freed). Seems we should be able to support that in the trace-events format string. Any other backend can turn "%+s" back into a normal "%s". > - Support for custom formats, e.g. I use %v in the XL compiler for LLVM > value pointers. This is a bit more advanced, but it would be neat to be > able to print out QOM objects using %q :-) Seems like a similar trick can be used here. The %v/%q could be turned back into a plain %p for backends which don't have the pretty-printing support. Regards, Daniel
On 2020-07-03 at 15:08 CEST, Daniel P. Berrangé wrote... > On Fri, Jul 03, 2020 at 12:12:14PM +0200, Christophe de Dinechin wrote: >> >> On 2020-07-02 at 15:47 CEST, Stefan Hajnoczi wrote... >> > On Wed, Jul 01, 2020 at 05:15:01PM +0100, Daniel P. Berrangé wrote: >> > At most it should require an >> >> extra annotation in the trace-events file to take the extra parameter >> >> for grouping, and other trace backends can ignore that. >> > >> > It's true, it may be possible to put this functionality in the >> > trace-events. >> >> Let me think more about integrating these features with other trace >> backends. See below for short-term impact. >> >> > Christophe: how does this differ from regular trace events and what >> > extra information is needed? >> >> - Grouping, as indicated above, mostly useful in practice to make selection >> of tracing topics easy (e.g. "modules") but also for real-time graphing, >> because typically a state change occurs in different functions, which is >> why I used locking as an example. > > In many cases we already have a form of grouping via the backdoor by > using a common string prefix on trace point names. eg > > qio_task_new(void *task, void *source, void *func, void *opaque) "Task new task=%p source=%p func=%p opaque=%p" > qio_task_complete(void *task) "Task complete task=%p" > qio_task_thread_start(void *task, void *worker, void *opaque) "Task thread start task=%p worker=%p opaque=%p" > qio_task_thread_run(void *task) "Task thread run task=%p" > qio_task_thread_exit(void *task) "Task thread exit task=%p" > qio_task_thread_result(void *task) "Task thread result task=%p" > qio_task_thread_source_attach(void *task, void *source) "Task thread source attach task=%p source=%p" > qio_task_thread_source_cancel(void *task, void *source) "Task thread source cancel task=%p source=%p" > > We could turn this into an explicit multi-level grouping concept, > using "." separator for each level of the group. eg Interesting idea. > > qio.task.new(void *task, void *source, void *func, void *opaque) "Task new task=%p source=%p func=%p opaque=%p" > qio.task.complete(void *task) "Task complete task=%p" > qio.task.thread_start(void *task, void *worker, void *opaque) "Task thread start task=%p worker=%p opaque=%p" > qio.task.thread_run(void *task) "Task thread run task=%p" > qio.task.thread_exit(void *task) "Task thread exit task=%p" > qio.task.thread_result(void *task) "Task thread result task=%p" > qio.task.thread_source_attach(void *task, void *source) "Task thread source attach task=%p source=%p" > qio.task.thread_source_cancel(void *task, void *source) "Task thread source cancel task=%p source=%p" > > > For backends that dont support groups, the "." can be turned back > into a "_", and everything is unchanged. For backends that do support > groups, we now have the info required. > > This would be useful for the systemtap backend because that has an > explicit grouping concept already. ie we have probes definitions > like: > > probe qemu.system.x86_64.qio_task_new = process("/usr/bin/qemu-system-x86_64").mark("qio_task_new") > { > task = $arg1; > source = $arg2; > func = $arg3; > opaque = $arg4; > } > > > which we can now turn into > > probe qemu.system.x86_64.qio.task.new = process("/usr/bin/qemu-system-x86_64").mark("qio_task_new") > { > task = $arg1; > source = $arg2; > func = $arg3; > opaque = $arg4; > } > > > the generated C trace macro would still use "_" not "." of course > so qio_task_new(src, func, opaque) > >> - Self-documentation. Right now, the recorder back-end generates rather >> unhelpful help messages. > > Not sure I understand the problem here. RECORDER_DEFINE specifies a text that is listed when you pass "help" as a trace name. There is no equivalent concept that I'm aware of in the existing qemu trace infrastructure. > >> - Trace buffer size. This is important to make post-mortem dumps usable if >> you record infrequent events alongside much higher-rate ones. For example, >> you may want a large buffer to record info about command-line option >> processing, the default 8 is definitely too small. > > Sure, that's a problem for all the trace backends. They all need to be > good at filtering / controlling what probes are fired to avoid signal > getting lost in noise. The grouping feature would certainly help with > that making it easier to control what it turned on/off. Well, for the recorder it's a slightly different problem. When you do a recorder post-mortem dump, entries are sorted and printed at that time. THis makes it possible to have a number of recent traces for high-frequency events, as well as some much older traces for low-frequency events. For example, you can have a buffer that records command-line options, and another that records events that occur typically once per second, and yet another that records high-frequency events. Then, when your program crashes and you look at a dump, you see a context that includes both recent high-frequency stuff as well as much older low-frequency events. See https://gitlab.com/c3d/recorder for examples. > >> - Support for %+s, which tells that a string is safe to print later (e.g. it >> is a compile-time constant, or never ever freed). > > Seems we should be able to support that in the trace-events format string. > Any other backend can turn "%+s" back into a normal "%s". Yes, but it's much more complicated than this simple patch series ;-) As is, the goal of this series is to not touch any other back-end. > >> - Support for custom formats, e.g. I use %v in the XL compiler for LLVM >> value pointers. This is a bit more advanced, but it would be neat to be >> able to print out QOM objects using %q :-) > > Seems like a similar trick can be used here. The %v/%q could be turned > back into a plain %p for backends which don't have the pretty-printing > support. Sure, but this is also touching the rest of the tracing infrastructure, which was a definite non-goal for this specific series. > > Regards, > Daniel -- Cheers, Christophe de Dinechin (IRC c3d)
diff --git a/util/qemu-thread-common.h b/util/qemu-thread-common.h index 2af6b12085..0de07a471f 100644 --- a/util/qemu-thread-common.h +++ b/util/qemu-thread-common.h @@ -15,6 +15,9 @@ #include "qemu/thread.h" #include "trace.h" +#include "trace/recorder.h" + +RECORDER_DEFINE(lock, 16, "Lock state"); static inline void qemu_mutex_post_init(QemuMutex *mutex) { @@ -23,12 +26,14 @@ static inline void qemu_mutex_post_init(QemuMutex *mutex) mutex->line = 0; #endif mutex->initialized = true; + record(lock, "Init state %d for %p", -1, mutex); } static inline void qemu_mutex_pre_lock(QemuMutex *mutex, const char *file, int line) { trace_qemu_mutex_lock(mutex, file, line); + record(lock, "Locking state %d for %p", 1, mutex); } static inline void qemu_mutex_post_lock(QemuMutex *mutex, @@ -39,6 +44,7 @@ static inline void qemu_mutex_post_lock(QemuMutex *mutex, mutex->line = line; #endif trace_qemu_mutex_locked(mutex, file, line); + record(lock, "Locked state %d for %p", 2, mutex); } static inline void qemu_mutex_pre_unlock(QemuMutex *mutex, @@ -49,6 +55,7 @@ static inline void qemu_mutex_pre_unlock(QemuMutex *mutex, mutex->line = 0; #endif trace_qemu_mutex_unlock(mutex, file, line); + record(lock, "Unkocked state %d for %p", 0, mutex); } #endif
This is an example showing how the recorder can be used to have one "topic" covering multiple entries. Here, the topic is "lock". Here are a few use cases: - Checking locks: RECORDER_TRACES=lock qemu - Graphic visualization of locks: RECORDER_TRACES="lock=state,id" qemu & recorder_scope state <Hit the 't' key to toggle timing display> <Hit the 'c' key to dump the screen data as CSV> cat recorder_scope_data-1.csv Signed-off-by: Christophe de Dinechin <dinechin@redhat.com> --- util/qemu-thread-common.h | 7 +++++++ 1 file changed, 7 insertions(+)