mbox series

[v2,0/7] Final fixes before KS 2.0

Message ID 20210517142140.286153-1-y.karadz@gmail.com (mailing list archive)
Headers show
Series Final fixes before KS 2.0 | expand

Message

Yordan Karadzhov May 17, 2021, 2:21 p.m. UTC
v2 changes:
 - Still showing all CPU plots from the new trace file when
   appending [PATCH kernel-shark: Preserve open graphs when
   appending data].
 - Setting "seq.buffer" to NULL after calling trace_seq_destroy()
   in [PATCH kernel-shark: Fix the checking if "trace_seq" was destroyed]
 - [PATCH kernel-shark: No slash at the end of KS_PLUGIN_INSTALL_PREFIX]
   is new.
 

Yordan Karadzhov (VMware) (7):
  kernel-shark: Preserve markers when appending data
  kernel-shark: Preserve open graphs when appending data
  kernel-shark: Clear before loading new session
  kernel-shark: Better handling of plugins when appending data file
  kernel-shark: Do draw the combo point of the mark
  kernel-shark: Fix the checking if "trace_seq" was destroyed
  kernel-shark: No slash at the end of KS_PLUGIN_INSTALL_PREFIX

 CMakeLists.txt          |  2 +-
 src/KsDualMarker.hpp    |  6 ++++++
 src/KsGLWidget.cpp      | 40 ++++++++++++++++++++++++++--------------
 src/KsGLWidget.hpp      |  4 +++-
 src/KsMainWindow.cpp    | 34 +++++++++++++++++++++++++++++-----
 src/KsPlotTools.cpp     |  3 +++
 src/KsSession.cpp       |  1 +
 src/KsTraceGraph.cpp    | 10 +++++++---
 src/KsTraceGraph.hpp    |  2 +-
 src/KsUtils.cpp         |  4 ++--
 src/libkshark-tepdata.c |  4 +++-
 11 files changed, 82 insertions(+), 28 deletions(-)

Comments

Steven Rostedt May 17, 2021, 11:21 p.m. UTC | #1
On Mon, 17 May 2021 17:21:33 +0300
"Yordan Karadzhov (VMware)" <y.karadz@gmail.com> wrote:

> v2 changes:
>  - Still showing all CPU plots from the new trace file when
>    appending [PATCH kernel-shark: Preserve open graphs when
>    appending data].
>  - Setting "seq.buffer" to NULL after calling trace_seq_destroy()
>    in [PATCH kernel-shark: Fix the checking if "trace_seq" was destroyed]
>  - [PATCH kernel-shark: No slash at the end of KS_PLUGIN_INSTALL_PREFIX]
>    is new.

Hi Yordan,

I was playing a bit with kernelshark, and found that if I load a file and
append one, exit, load them again, then click:

  File -> Sessions -> Restore Last Session

It crashes.

Looks to be something is freed and then reused, because when I ran it under
gdb, it crashed in allocation of memory (asprintf). That usually means that
something was freed twice, someplace else. Or freed and then used.

-- Steve
Steven Rostedt May 17, 2021, 11:28 p.m. UTC | #2
On Mon, 17 May 2021 19:21:04 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Mon, 17 May 2021 17:21:33 +0300
> "Yordan Karadzhov (VMware)" <y.karadz@gmail.com> wrote:
> 
> > v2 changes:
> >  - Still showing all CPU plots from the new trace file when
> >    appending [PATCH kernel-shark: Preserve open graphs when
> >    appending data].
> >  - Setting "seq.buffer" to NULL after calling trace_seq_destroy()
> >    in [PATCH kernel-shark: Fix the checking if "trace_seq" was destroyed]
> >  - [PATCH kernel-shark: No slash at the end of KS_PLUGIN_INSTALL_PREFIX]
> >    is new.  
> 
> Hi Yordan,
> 
> I was playing a bit with kernelshark, and found that if I load a file and
> append one, exit, load them again, then click:
> 
>   File -> Sessions -> Restore Last Session
> 
> It crashes.
> 
> Looks to be something is freed and then reused, because when I ran it under
> gdb, it crashed in allocation of memory (asprintf). That usually means that
> something was freed twice, someplace else. Or freed and then used.
> 

Running valgrind, reported this:

==6862== Invalid read of size 8
==6862==    at 0x494CA89: map_collection_back_request (libkshark-collection.c:474)

static int
map_collection_back_request(const struct kshark_entry_collection *col,
			    struct kshark_entry_request *req)
{
	size_t req_first, req_end;
	ssize_t col_index;
	int req_count;

	col_index = map_collection_request_init(col, req, false, &req_end);
	if (col_index == KS_EMPTY_BIN)
		return 0;

	/*
	 * Now loop over the intervals of the collection going backwards till
	 * the end of the inputted request and create a separate request for
	 * each of those interest.
	 */
	req_count = 1;
	while (col_index >= 0 && req_end <= col->break_points[col_index]) {

// col_index can be zero entering this loop.

		if (req_end >= col->resume_points[col_index]) {
			/*
			 * The last entry of the original request is inside
			 * the "col_index" collection interval. Close the
			 * collection request here and return.
			 */
			req->n = req->first - req_end + 1;
			break;
		}

		/*
		 * The last entry of the original request is outside of the
		 * "col_index" interval. Close the collection request at the
		 * end of this interval and move to the next one. Try to make
		 * another request there.
		 */
		req->n = req->first -
			 col->resume_points[col_index] + 1;

		--col_index;

// col_index is decremented (-1)

		if (req_end > col->break_points[col_index]) {

Reading a negative index in an array.

Which is where valgrind reported. But I don't think this is what caused the
crash.

-- Steve
Yordan Karadzhov May 18, 2021, 7:30 a.m. UTC | #3
On 18.05.21 г. 2:21, Steven Rostedt wrote:
> On Mon, 17 May 2021 17:21:33 +0300
> "Yordan Karadzhov (VMware)" <y.karadz@gmail.com> wrote:
> 
>> v2 changes:
>>   - Still showing all CPU plots from the new trace file when
>>     appending [PATCH kernel-shark: Preserve open graphs when
>>     appending data].
>>   - Setting "seq.buffer" to NULL after calling trace_seq_destroy()
>>     in [PATCH kernel-shark: Fix the checking if "trace_seq" was destroyed]
>>   - [PATCH kernel-shark: No slash at the end of KS_PLUGIN_INSTALL_PREFIX]
>>     is new.
> 
> Hi Yordan,
> 
> I was playing a bit with kernelshark, and found that if I load a file and
> append one, exit, load them again, then click:
> 
>    File -> Sessions -> Restore Last Session
> 
> It crashes.

Unfortunately I am not able to reproduce the crash. maybe it has 
something to do with the particular data files you use.

> 
> Looks to be something is freed and then reused, because when I ran it under
> gdb, it crashed in allocation of memory (asprintf). That usually means that
> something was freed twice, someplace else. Or freed and then used.

Is it possible to send me a backtrace of the stack?

Thanks!
Yordan

> 
> -- Steve
>
Steven Rostedt May 18, 2021, 12:46 p.m. UTC | #4
On Tue, 18 May 2021 10:30:42 +0300
Yordan Karadzhov <y.karadz@gmail.com> wrote:

> On 18.05.21 г. 2:21, Steven Rostedt wrote:
> > On Mon, 17 May 2021 17:21:33 +0300
> > "Yordan Karadzhov (VMware)" <y.karadz@gmail.com> wrote:
> >   
> >> v2 changes:
> >>   - Still showing all CPU plots from the new trace file when
> >>     appending [PATCH kernel-shark: Preserve open graphs when
> >>     appending data].
> >>   - Setting "seq.buffer" to NULL after calling trace_seq_destroy()
> >>     in [PATCH kernel-shark: Fix the checking if "trace_seq" was destroyed]
> >>   - [PATCH kernel-shark: No slash at the end of KS_PLUGIN_INSTALL_PREFIX]
> >>     is new.  
> > 
> > Hi Yordan,
> > 
> > I was playing a bit with kernelshark, and found that if I load a file and
> > append one, exit, load them again, then click:
> > 
> >    File -> Sessions -> Restore Last Session
> > 
> > It crashes.  
> 
> Unfortunately I am not able to reproduce the crash. maybe it has 
> something to do with the particular data files you use.

BTW, sometimes I need to do it twice. That is, I hit "Restore Last Session"
twice.

As this is a memory corruption issue, it will behave differently on
different machines. Also, I do get the message:

  "Usage of trace_seq after it was destroyed"

> 
> > 
> > Looks to be something is freed and then reused, because when I ran it under
> > gdb, it crashed in allocation of memory (asprintf). That usually means that
> > something was freed twice, someplace else. Or freed and then used.  
> 
> Is it possible to send me a backtrace of the stack?

Here's the backtrace from gdb:

(gdb) bt
#0  0x00007ffff63cb02a in __strlen_sse2 () from /lib64/libc.so.6
#1  0x00007ffff63994f8 in __vfprintf_internal () from /lib64/libc.so.6
#2  0x00007ffff63aa015 in __vasprintf_internal () from /lib64/libc.so.6
#3  0x00007ffff63844fa in asprintf () from /lib64/libc.so.6
#4  0x00007ffff7ec88f9 in tepdata_get_latency (entry=<optimized out>, 
    stream=0x7fff70ff1320)
    at /work/git-local/kernel-shark.git/src/libkshark-tepdata.c:805
#5  tepdata_get_latency (stream=0x7fff70ff1320, entry=<optimized out>)
    at /work/git-local/kernel-shark.git/src/libkshark-tepdata.c:776
#6  0x00007ffff7f59a40 in KsViewModel::getValueStr (this=0x7fffffffc6e8, 
    column=<optimized out>, row=0)
    at /work/git-local/kernel-shark.git/src/KsModels.cpp:358
#7  0x00007ffff7f59aa7 in KsViewModel::getValue (this=<optimized out>, 
    column=<optimized out>, row=<optimized out>)
    at /work/git-local/kernel-shark.git/src/KsModels.cpp:377
#8  0x00007ffff7f59b37 in KsViewModel::data (this=<optimized out>, index=..., 
    role=<optimized out>) at /work/git-local/kernel-shark.git/src/KsModels.cpp:312
#9  0x00007ffff6ac2139 in QSortFilterProxyModel::data(QModelIndex const&, int) const ()
   from /lib64/libQt5Core.so.5
#10 0x00007ffff799a185 in QStyledItemDelegate::initStyleOption(QStyleOptionViewItem*, QModelIndex const&) const () from /lib64/libQt5Widgets.so.5
#11 0x00007ffff79997ea in QStyledItemDelegate::sizeHint(QStyleOptionViewItem const&, QModelIndex const&) const () from /lib64/libQt5Widgets.so.5
#12 0x00007ffff79c72ef in QTableViewPrivate::widthHintForIndex(QModelIndex const&, int, QStyleOptionViewItem const&) const () from /lib64/libQt5Widgets.so.5
#13 0x00007ffff79c7510 in QTableView::sizeHintForColumn(int) const ()
   from /lib64/libQt5Widgets.so.5
--Type <RET> for more, q to quit, c to continue without paging--
#14 0x00007ffff7985142 in QHeaderViewPrivate::resizeSections(QHeaderView::ResizeMode, bool) () from /lib64/libQt5Widgets.so.5
#15 0x00007ffff7f80523 in KsTraceViewer::_resizeToContents (this=0x7fffffffc660)
    at /work/git-local/kernel-shark.git/src/KsTraceViewer.cpp:573
#16 0x00007ffff7f80eb3 in KsTraceViewer::loadData (this=this@entry=0x7fffffffc660, data=
    0x7fffffffb508, data@entry=0x7fffffffc640)
    at /work/git-local/kernel-shark.git/src/KsTraceViewer.cpp:163
#17 0x00007ffff7f8ee6a in KsMainWindow::loadSession (this=0x7fffffffc5d0, fileName=...)
    at /work/git-local/kernel-shark.git/src/KsMainWindow.cpp:1434
#18 0x00007ffff7f8f165 in KsMainWindow::_restoreSession (this=0x7fffffffc5d0)
    at /work/git-local/kernel-shark.git/src/KsMainWindow.cpp:668
#19 0x00007ffff6b14386 in void doActivate<false>(QObject*, int, void**) ()
   from /lib64/libQt5Core.so.5
#20 0x00007ffff770a646 in QAction::triggered(bool) () from /lib64/libQt5Widgets.so.5
#21 0x00007ffff770cf31 in QAction::activate(QAction::ActionEvent) ()
   from /lib64/libQt5Widgets.so.5
#22 0x00007ffff788be9a in QMenuPrivate::activateCausedStack(QVector<QPointer<QWidget> > const&, QAction*, QAction::ActionEvent, bool) () from /lib64/libQt5Widgets.so.5
#23 0x00007ffff7893512 in QMenuPrivate::activateAction(QAction*, QAction::ActionEvent, bool) () from /lib64/libQt5Widgets.so.5
#24 0x00007ffff7751b1e in QWidget::event(QEvent*) () from /lib64/libQt5Widgets.so.5
#25 0x00007ffff7710ec3 in QApplicationPrivate::notify_helper(QObject*, QEvent*) ()
   from /lib64/libQt5Widgets.so.5
#26 0x00007ffff7717eeb in QApplication::notify(QObject*, QEvent*) ()
   from /lib64/libQt5Widgets.so.5
#27 0x00007ffff6ae4bd8 in QCoreApplication::notifyInternal2(QObject*, QEvent*) ()
   from /lib64/libQt5Core.so.5
--Type <RET> for more, q to quit, c to continue without paging--
#28 0x00007ffff7716efa in QApplicationPrivate::sendMouseEvent(QWidget*, QMouseEvent*, QWidget*, QWidget*, QWidget**, QPointer<QWidget>&, bool, bool) ()
   from /lib64/libQt5Widgets.so.5
#29 0x00007ffff776a8e3 in QWidgetWindow::handleMouseEvent(QMouseEvent*) ()
   from /lib64/libQt5Widgets.so.5
#30 0x00007ffff776d6be in QWidgetWindow::event(QEvent*) ()
   from /lib64/libQt5Widgets.so.5
#31 0x00007ffff7710ec3 in QApplicationPrivate::notify_helper(QObject*, QEvent*) ()
   from /lib64/libQt5Widgets.so.5
#32 0x00007ffff6ae4bd8 in QCoreApplication::notifyInternal2(QObject*, QEvent*) ()
   from /lib64/libQt5Core.so.5
#33 0x00007ffff70b7143 in QGuiApplicationPrivate::processMouseEvent(QWindowSystemInterfacePrivate::MouseEvent*) () from /lib64/libQt5Gui.so.5
#34 0x00007ffff70988cc in QWindowSystemInterface::sendWindowSystemEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /lib64/libQt5Gui.so.5
#35 0x00007fffe5c5747e in xcbSourceDispatch(_GSource*, int (*)(void*), void*) ()
   from /lib64/libQt5XcbQpa.so.5
#36 0x00007ffff5450a9f in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
#37 0x00007ffff54a2a98 in g_main_context_iterate.constprop ()
   from /lib64/libglib-2.0.so.0
#38 0x00007ffff544de73 in g_main_context_iteration () from /lib64/libglib-2.0.so.0
#39 0x00007ffff6b316f3 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /lib64/libQt5Core.so.5
#40 0x00007ffff6ae357b in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) ()
   from /lib64/libQt5Core.so.5
#41 0x00007ffff6aeb1b4 in QCoreApplication::exec() () from /lib64/libQt5Core.so.5
#42 0x0000000000402aa1 in main (argc=<optimized out>, argv=<optimized out>)
--Type <RET> for more, q to quit, c to continue without paging--
    at /work/git-local/kernel-shark.git/src/kernelshark.cpp:154

You may want to play with valgrind some more.

Attached is my .cache/kernelshark/ content.

-- Steve
Yordan Karadzhov May 18, 2021, 12:58 p.m. UTC | #5
On 18.05.21 г. 15:46, Steven Rostedt wrote:
>> Unfortunately I am not able to reproduce the crash. maybe it has
>> something to do with the particular data files you use.
> BTW, sometimes I need to do it twice. That is, I hit "Restore Last Session"
> twice.
> 

I restored the session 20 times in a row without having a crash.


> As this is a memory corruption issue, it will behave differently on
> different machines. Also, I do get the message:
> 
>    "Usage of trace_seq after it was destroyed"
> 
>>> Looks to be something is freed and then reused, because when I ran it under
>>> gdb, it crashed in allocation of memory (asprintf). That usually means that
>>> something was freed twice, someplace else. Or freed and then used.
>> Is it possible to send me a backtrace of the stack?
> Here's the backtrace from gdb:
> 
> (gdb) bt
> #0  0x00007ffff63cb02a in __strlen_sse2 () from /lib64/libc.so.6
> #1  0x00007ffff63994f8 in __vfprintf_internal () from /lib64/libc.so.6
> #2  0x00007ffff63aa015 in __vasprintf_internal () from /lib64/libc.so.6
> #3  0x00007ffff63844fa in asprintf () from /lib64/libc.so.6
> #4  0x00007ffff7ec88f9 in tepdata_get_latency (entry=<optimized out>,

This backtrace is very suspicious. It is exactly the same crash I had 
before applying [PATCH v2 6/7] kernel-shark: Fix the checking if 
"trace_seq" was destroyed.

I guess because of some reason it still fails to detect that the 
trace_seq was destroyed and needs to be initialized again.

Thanks!
Yordan
Steven Rostedt May 18, 2021, 1:44 p.m. UTC | #6
On Tue, 18 May 2021 15:58:55 +0300
Yordan Karadzhov <y.karadz@gmail.com> wrote:

> On 18.05.21 г. 15:46, Steven Rostedt wrote:
> >> Unfortunately I am not able to reproduce the crash. maybe it has
> >> something to do with the particular data files you use.  
> > BTW, sometimes I need to do it twice. That is, I hit "Restore Last Session"
> > twice.
> >   
> 
> I restored the session 20 times in a row without having a crash.

Nevermind. I just realized that I was in the process of rebasing your
patches and testing each one when I noticed this bug.

After applying the rest of the patches, I'm not able to reproduce it. I
thought I was done with the rebase (doing too many things in parallel
isn't the best thing :-p).

OK, I think this is good to go (although the col_index looks like it can
still be referenced as a negative).

Sorry for the goose chase on the crash :-/

-- Steve