Message ID | 20210517142140.286153-1-y.karadz@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Final fixes before KS 2.0 | expand |
On Mon, 17 May 2021 17:21:33 +0300 "Yordan Karadzhov (VMware)" <y.karadz@gmail.com> wrote: > v2 changes: > - Still showing all CPU plots from the new trace file when > appending [PATCH kernel-shark: Preserve open graphs when > appending data]. > - Setting "seq.buffer" to NULL after calling trace_seq_destroy() > in [PATCH kernel-shark: Fix the checking if "trace_seq" was destroyed] > - [PATCH kernel-shark: No slash at the end of KS_PLUGIN_INSTALL_PREFIX] > is new. Hi Yordan, I was playing a bit with kernelshark, and found that if I load a file and append one, exit, load them again, then click: File -> Sessions -> Restore Last Session It crashes. Looks to be something is freed and then reused, because when I ran it under gdb, it crashed in allocation of memory (asprintf). That usually means that something was freed twice, someplace else. Or freed and then used. -- Steve
On Mon, 17 May 2021 19:21:04 -0400 Steven Rostedt <rostedt@goodmis.org> wrote: > On Mon, 17 May 2021 17:21:33 +0300 > "Yordan Karadzhov (VMware)" <y.karadz@gmail.com> wrote: > > > v2 changes: > > - Still showing all CPU plots from the new trace file when > > appending [PATCH kernel-shark: Preserve open graphs when > > appending data]. > > - Setting "seq.buffer" to NULL after calling trace_seq_destroy() > > in [PATCH kernel-shark: Fix the checking if "trace_seq" was destroyed] > > - [PATCH kernel-shark: No slash at the end of KS_PLUGIN_INSTALL_PREFIX] > > is new. > > Hi Yordan, > > I was playing a bit with kernelshark, and found that if I load a file and > append one, exit, load them again, then click: > > File -> Sessions -> Restore Last Session > > It crashes. > > Looks to be something is freed and then reused, because when I ran it under > gdb, it crashed in allocation of memory (asprintf). That usually means that > something was freed twice, someplace else. Or freed and then used. > Running valgrind, reported this: ==6862== Invalid read of size 8 ==6862== at 0x494CA89: map_collection_back_request (libkshark-collection.c:474) static int map_collection_back_request(const struct kshark_entry_collection *col, struct kshark_entry_request *req) { size_t req_first, req_end; ssize_t col_index; int req_count; col_index = map_collection_request_init(col, req, false, &req_end); if (col_index == KS_EMPTY_BIN) return 0; /* * Now loop over the intervals of the collection going backwards till * the end of the inputted request and create a separate request for * each of those interest. */ req_count = 1; while (col_index >= 0 && req_end <= col->break_points[col_index]) { // col_index can be zero entering this loop. if (req_end >= col->resume_points[col_index]) { /* * The last entry of the original request is inside * the "col_index" collection interval. Close the * collection request here and return. */ req->n = req->first - req_end + 1; break; } /* * The last entry of the original request is outside of the * "col_index" interval. Close the collection request at the * end of this interval and move to the next one. Try to make * another request there. */ req->n = req->first - col->resume_points[col_index] + 1; --col_index; // col_index is decremented (-1) if (req_end > col->break_points[col_index]) { Reading a negative index in an array. Which is where valgrind reported. But I don't think this is what caused the crash. -- Steve
On 18.05.21 г. 2:21, Steven Rostedt wrote: > On Mon, 17 May 2021 17:21:33 +0300 > "Yordan Karadzhov (VMware)" <y.karadz@gmail.com> wrote: > >> v2 changes: >> - Still showing all CPU plots from the new trace file when >> appending [PATCH kernel-shark: Preserve open graphs when >> appending data]. >> - Setting "seq.buffer" to NULL after calling trace_seq_destroy() >> in [PATCH kernel-shark: Fix the checking if "trace_seq" was destroyed] >> - [PATCH kernel-shark: No slash at the end of KS_PLUGIN_INSTALL_PREFIX] >> is new. > > Hi Yordan, > > I was playing a bit with kernelshark, and found that if I load a file and > append one, exit, load them again, then click: > > File -> Sessions -> Restore Last Session > > It crashes. Unfortunately I am not able to reproduce the crash. maybe it has something to do with the particular data files you use. > > Looks to be something is freed and then reused, because when I ran it under > gdb, it crashed in allocation of memory (asprintf). That usually means that > something was freed twice, someplace else. Or freed and then used. Is it possible to send me a backtrace of the stack? Thanks! Yordan > > -- Steve >
On Tue, 18 May 2021 10:30:42 +0300 Yordan Karadzhov <y.karadz@gmail.com> wrote: > On 18.05.21 г. 2:21, Steven Rostedt wrote: > > On Mon, 17 May 2021 17:21:33 +0300 > > "Yordan Karadzhov (VMware)" <y.karadz@gmail.com> wrote: > > > >> v2 changes: > >> - Still showing all CPU plots from the new trace file when > >> appending [PATCH kernel-shark: Preserve open graphs when > >> appending data]. > >> - Setting "seq.buffer" to NULL after calling trace_seq_destroy() > >> in [PATCH kernel-shark: Fix the checking if "trace_seq" was destroyed] > >> - [PATCH kernel-shark: No slash at the end of KS_PLUGIN_INSTALL_PREFIX] > >> is new. > > > > Hi Yordan, > > > > I was playing a bit with kernelshark, and found that if I load a file and > > append one, exit, load them again, then click: > > > > File -> Sessions -> Restore Last Session > > > > It crashes. > > Unfortunately I am not able to reproduce the crash. maybe it has > something to do with the particular data files you use. BTW, sometimes I need to do it twice. That is, I hit "Restore Last Session" twice. As this is a memory corruption issue, it will behave differently on different machines. Also, I do get the message: "Usage of trace_seq after it was destroyed" > > > > > Looks to be something is freed and then reused, because when I ran it under > > gdb, it crashed in allocation of memory (asprintf). That usually means that > > something was freed twice, someplace else. Or freed and then used. > > Is it possible to send me a backtrace of the stack? Here's the backtrace from gdb: (gdb) bt #0 0x00007ffff63cb02a in __strlen_sse2 () from /lib64/libc.so.6 #1 0x00007ffff63994f8 in __vfprintf_internal () from /lib64/libc.so.6 #2 0x00007ffff63aa015 in __vasprintf_internal () from /lib64/libc.so.6 #3 0x00007ffff63844fa in asprintf () from /lib64/libc.so.6 #4 0x00007ffff7ec88f9 in tepdata_get_latency (entry=<optimized out>, stream=0x7fff70ff1320) at /work/git-local/kernel-shark.git/src/libkshark-tepdata.c:805 #5 tepdata_get_latency (stream=0x7fff70ff1320, entry=<optimized out>) at /work/git-local/kernel-shark.git/src/libkshark-tepdata.c:776 #6 0x00007ffff7f59a40 in KsViewModel::getValueStr (this=0x7fffffffc6e8, column=<optimized out>, row=0) at /work/git-local/kernel-shark.git/src/KsModels.cpp:358 #7 0x00007ffff7f59aa7 in KsViewModel::getValue (this=<optimized out>, column=<optimized out>, row=<optimized out>) at /work/git-local/kernel-shark.git/src/KsModels.cpp:377 #8 0x00007ffff7f59b37 in KsViewModel::data (this=<optimized out>, index=..., role=<optimized out>) at /work/git-local/kernel-shark.git/src/KsModels.cpp:312 #9 0x00007ffff6ac2139 in QSortFilterProxyModel::data(QModelIndex const&, int) const () from /lib64/libQt5Core.so.5 #10 0x00007ffff799a185 in QStyledItemDelegate::initStyleOption(QStyleOptionViewItem*, QModelIndex const&) const () from /lib64/libQt5Widgets.so.5 #11 0x00007ffff79997ea in QStyledItemDelegate::sizeHint(QStyleOptionViewItem const&, QModelIndex const&) const () from /lib64/libQt5Widgets.so.5 #12 0x00007ffff79c72ef in QTableViewPrivate::widthHintForIndex(QModelIndex const&, int, QStyleOptionViewItem const&) const () from /lib64/libQt5Widgets.so.5 #13 0x00007ffff79c7510 in QTableView::sizeHintForColumn(int) const () from /lib64/libQt5Widgets.so.5 --Type <RET> for more, q to quit, c to continue without paging-- #14 0x00007ffff7985142 in QHeaderViewPrivate::resizeSections(QHeaderView::ResizeMode, bool) () from /lib64/libQt5Widgets.so.5 #15 0x00007ffff7f80523 in KsTraceViewer::_resizeToContents (this=0x7fffffffc660) at /work/git-local/kernel-shark.git/src/KsTraceViewer.cpp:573 #16 0x00007ffff7f80eb3 in KsTraceViewer::loadData (this=this@entry=0x7fffffffc660, data= 0x7fffffffb508, data@entry=0x7fffffffc640) at /work/git-local/kernel-shark.git/src/KsTraceViewer.cpp:163 #17 0x00007ffff7f8ee6a in KsMainWindow::loadSession (this=0x7fffffffc5d0, fileName=...) at /work/git-local/kernel-shark.git/src/KsMainWindow.cpp:1434 #18 0x00007ffff7f8f165 in KsMainWindow::_restoreSession (this=0x7fffffffc5d0) at /work/git-local/kernel-shark.git/src/KsMainWindow.cpp:668 #19 0x00007ffff6b14386 in void doActivate<false>(QObject*, int, void**) () from /lib64/libQt5Core.so.5 #20 0x00007ffff770a646 in QAction::triggered(bool) () from /lib64/libQt5Widgets.so.5 #21 0x00007ffff770cf31 in QAction::activate(QAction::ActionEvent) () from /lib64/libQt5Widgets.so.5 #22 0x00007ffff788be9a in QMenuPrivate::activateCausedStack(QVector<QPointer<QWidget> > const&, QAction*, QAction::ActionEvent, bool) () from /lib64/libQt5Widgets.so.5 #23 0x00007ffff7893512 in QMenuPrivate::activateAction(QAction*, QAction::ActionEvent, bool) () from /lib64/libQt5Widgets.so.5 #24 0x00007ffff7751b1e in QWidget::event(QEvent*) () from /lib64/libQt5Widgets.so.5 #25 0x00007ffff7710ec3 in QApplicationPrivate::notify_helper(QObject*, QEvent*) () from /lib64/libQt5Widgets.so.5 #26 0x00007ffff7717eeb in QApplication::notify(QObject*, QEvent*) () from /lib64/libQt5Widgets.so.5 #27 0x00007ffff6ae4bd8 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () from /lib64/libQt5Core.so.5 --Type <RET> for more, q to quit, c to continue without paging-- #28 0x00007ffff7716efa in QApplicationPrivate::sendMouseEvent(QWidget*, QMouseEvent*, QWidget*, QWidget*, QWidget**, QPointer<QWidget>&, bool, bool) () from /lib64/libQt5Widgets.so.5 #29 0x00007ffff776a8e3 in QWidgetWindow::handleMouseEvent(QMouseEvent*) () from /lib64/libQt5Widgets.so.5 #30 0x00007ffff776d6be in QWidgetWindow::event(QEvent*) () from /lib64/libQt5Widgets.so.5 #31 0x00007ffff7710ec3 in QApplicationPrivate::notify_helper(QObject*, QEvent*) () from /lib64/libQt5Widgets.so.5 #32 0x00007ffff6ae4bd8 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () from /lib64/libQt5Core.so.5 #33 0x00007ffff70b7143 in QGuiApplicationPrivate::processMouseEvent(QWindowSystemInterfacePrivate::MouseEvent*) () from /lib64/libQt5Gui.so.5 #34 0x00007ffff70988cc in QWindowSystemInterface::sendWindowSystemEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /lib64/libQt5Gui.so.5 #35 0x00007fffe5c5747e in xcbSourceDispatch(_GSource*, int (*)(void*), void*) () from /lib64/libQt5XcbQpa.so.5 #36 0x00007ffff5450a9f in g_main_context_dispatch () from /lib64/libglib-2.0.so.0 #37 0x00007ffff54a2a98 in g_main_context_iterate.constprop () from /lib64/libglib-2.0.so.0 #38 0x00007ffff544de73 in g_main_context_iteration () from /lib64/libglib-2.0.so.0 #39 0x00007ffff6b316f3 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /lib64/libQt5Core.so.5 #40 0x00007ffff6ae357b in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /lib64/libQt5Core.so.5 #41 0x00007ffff6aeb1b4 in QCoreApplication::exec() () from /lib64/libQt5Core.so.5 #42 0x0000000000402aa1 in main (argc=<optimized out>, argv=<optimized out>) --Type <RET> for more, q to quit, c to continue without paging-- at /work/git-local/kernel-shark.git/src/kernelshark.cpp:154 You may want to play with valgrind some more. Attached is my .cache/kernelshark/ content. -- Steve
On 18.05.21 г. 15:46, Steven Rostedt wrote: >> Unfortunately I am not able to reproduce the crash. maybe it has >> something to do with the particular data files you use. > BTW, sometimes I need to do it twice. That is, I hit "Restore Last Session" > twice. > I restored the session 20 times in a row without having a crash. > As this is a memory corruption issue, it will behave differently on > different machines. Also, I do get the message: > > "Usage of trace_seq after it was destroyed" > >>> Looks to be something is freed and then reused, because when I ran it under >>> gdb, it crashed in allocation of memory (asprintf). That usually means that >>> something was freed twice, someplace else. Or freed and then used. >> Is it possible to send me a backtrace of the stack? > Here's the backtrace from gdb: > > (gdb) bt > #0 0x00007ffff63cb02a in __strlen_sse2 () from /lib64/libc.so.6 > #1 0x00007ffff63994f8 in __vfprintf_internal () from /lib64/libc.so.6 > #2 0x00007ffff63aa015 in __vasprintf_internal () from /lib64/libc.so.6 > #3 0x00007ffff63844fa in asprintf () from /lib64/libc.so.6 > #4 0x00007ffff7ec88f9 in tepdata_get_latency (entry=<optimized out>, This backtrace is very suspicious. It is exactly the same crash I had before applying [PATCH v2 6/7] kernel-shark: Fix the checking if "trace_seq" was destroyed. I guess because of some reason it still fails to detect that the trace_seq was destroyed and needs to be initialized again. Thanks! Yordan
On Tue, 18 May 2021 15:58:55 +0300 Yordan Karadzhov <y.karadz@gmail.com> wrote: > On 18.05.21 г. 15:46, Steven Rostedt wrote: > >> Unfortunately I am not able to reproduce the crash. maybe it has > >> something to do with the particular data files you use. > > BTW, sometimes I need to do it twice. That is, I hit "Restore Last Session" > > twice. > > > > I restored the session 20 times in a row without having a crash. Nevermind. I just realized that I was in the process of rebasing your patches and testing each one when I noticed this bug. After applying the rest of the patches, I'm not able to reproduce it. I thought I was done with the rebase (doing too many things in parallel isn't the best thing :-p). OK, I think this is good to go (although the col_index looks like it can still be referenced as a negative). Sorry for the goose chase on the crash :-/ -- Steve