Message ID | 20170427121458.2be577cc@gandalf.local.home (mailing list archive) |
---|---|
State | New, archived |
On (04/27/17 12:14), Steven Rostedt wrote:
[..]
> I tried this patch. It's better because I get the end of the trace, but
> I do lose the beginning of it:
>
> ** 196358 printk messages dropped ** [ 102.321182] perf-5981 0.... 12983650us : d_path <-seq_path

many thanks!

so we now drop messages from logbuf, not from per-CPU buffers. that
"queue printk_deferred irq_work on every online CPU when we bypass per-CPU
buffers from NMI" idea *probably* might help here - we need someone to emit
messages from the logbuf while we printk from NMI. there is still a
possibility that we can drop messages, though, since log_store() from NMI
CPU can be much-much faster than call_console_drivers() on other CPU.

-ss
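The rate mismatch described above (a fast log_store() producer against a slow call_console_drivers() consumer) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `simulate_drops()` and its parameters are hypothetical names invented for illustration, time is reduced to discrete ticks, and nothing here is kernel code or reflects the real logbuf layout.

```c
#include <assert.h>

/*
 * Hypothetical model, not kernel code: in every tick the producer stores
 * `produce_per_tick` messages into a buffer holding at most `capacity`
 * messages, then the consumer prints up to `consume_per_tick` of them.
 * Returns how many messages were lost because the buffer was full.
 */
static unsigned long simulate_drops(unsigned long capacity,
                                    unsigned long produce_per_tick,
                                    unsigned long consume_per_tick,
                                    unsigned long ticks)
{
    unsigned long queued = 0;
    unsigned long dropped = 0;
    unsigned long t;

    assert(capacity > 0);

    for (t = 0; t < ticks; t++) {
        /* Producer side: everything beyond the capacity is lost. */
        queued += produce_per_tick;
        if (queued > capacity) {
            dropped += queued - capacity;
            queued = capacity;
        }

        /* Consumer side: drain as much as it manages in one tick. */
        if (queued < consume_per_tick)
            queued = 0;
        else
            queued -= consume_per_tick;
    }

    return dropped;
}
```

In this model, a consumer that keeps up with the producer never loses anything, while a producer that is ten times faster starts losing messages as soon as the buffer fills, no matter how many consumers are woken up later.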
On Fri 2017-04-28 10:35:32, Sergey Senozhatsky wrote:
> On (04/27/17 12:14), Steven Rostedt wrote:
> [..]
> > I tried this patch. It's better because I get the end of the trace, but
> > I do lose the beginning of it:
> >
> > ** 196358 printk messages dropped ** [ 102.321182] perf-5981 0.... 12983650us : d_path <-seq_path
>
> many thanks!
>
> so we now drop messages from logbuf, not from per-CPU buffers. that
> "queue printk_deferred irq_work on every online CPU when we bypass per-CPU
> buffers from NMI" idea *probably* might help here - we need someone to emit
> messages from the logbuf while we printk from NMI. there is still a
> possibility that we can drop messages, though, since log_store() from NMI
> CPU can be much-much faster than call_console_drivers() on other CPU.

ftrace log is dumped via trace_panic_notifier. It is done after
smp_send_stop(). It means that only a single CPU is available
and it is NMI context at the moment.

One possibility might be to put printk into a special mode and
drop the last messages instead of the first ones. But this would
need to be configurable.

Of course, if the problem is reproducible, the easiest solution
is to use bigger main log buffer, for example boot with
log_buf_len=32M.

Best Regards,
Petr
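The two overflow behaviours mentioned above (losing the first messages versus losing the last ones) can be sketched with a toy ring buffer. This is a userspace illustration only: `struct msg_ring`, `ring_store()`, `RING_CAP` and the policy names are all hypothetical and do not correspond to any existing printk interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy switch; printk has no such knob in this thread. */
enum overflow_policy { DROP_OLDEST, DROP_NEWEST };

#define RING_CAP 4

struct msg_ring {
    int slots[RING_CAP];        /* message ids stand in for log records */
    size_t head;                /* index of the oldest stored message */
    size_t len;                 /* number of messages currently stored */
    unsigned long dropped;      /* messages lost to overflow */
    enum overflow_policy policy;
};

static void ring_init(struct msg_ring *r, enum overflow_policy policy)
{
    assert(r != NULL);
    r->head = 0;
    r->len = 0;
    r->dropped = 0;
    r->policy = policy;
}

static void ring_store(struct msg_ring *r, int id)
{
    if (r->len == RING_CAP) {
        r->dropped++;
        if (r->policy == DROP_NEWEST)
            return;             /* keep the beginning, lose the new message */
        /* DROP_OLDEST: give up the oldest slot so the new message fits. */
        r->head = (r->head + 1) % RING_CAP;
        r->len--;
    }
    r->slots[(r->head + r->len) % RING_CAP] = id;
    r->len++;
}

/* Return the i-th stored message, counting from the oldest one. */
static int ring_peek(const struct msg_ring *r, size_t i)
{
    assert(i < r->len);
    return r->slots[(r->head + i) % RING_CAP];
}
```

Storing messages 1..10 into a four-slot ring leaves 7..10 under DROP_OLDEST (the end of the trace survives, as in the report above) and 1..4 under DROP_NEWEST (the beginning survives). Both policies lose six messages; they differ only in which ones.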
On Fri, 28 Apr 2017 14:57:25 +0200 Petr Mladek <pmladek@suse.com> wrote:

> Of course, if the problem is reproducible, the easiest solution
> is to use bigger main log buffer, for example boot with
> log_buf_len=32M.

Of course that may not be enough. Especially when I have a machine with
240 CPUs. But it also has a ton of RAM, I could easily do
log_buf_len=32G

-- Steve
Index: linux-trace.git/kernel/trace/trace_functions.c
===================================================================
--- linux-trace.git.orig/kernel/trace/trace_functions.c
+++ linux-trace.git/kernel/trace/trace_functions.c
@@ -469,8 +469,11 @@ ftrace_cpudump_probe(unsigned long ip, u
 		       struct trace_array *tr, struct ftrace_probe_ops *ops,
 		       void *data)
 {
-	if (update_count(ops, ip, data))
-		ftrace_dump(DUMP_ORIG);
+	char *killer = NULL;
+
+	panic_on_oops = 1;	/* force panic */
+	wmb();
+	*killer = 1;
 }
 
 static int