Message ID | 20190327160323.31654-8-ykaradzhov@vmware.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | NumPy Interface for KernelShark | expand |
On Wed, Mar 27, 2019 at 6:04 PM Yordan Karadzhov <ykaradzhov@vmware.com> wrote: > > The example script plots the distribution (histogram) of the pulseaudio > wake-up times and finds the biggest latency. The script also generates > a KernelShark session descriptor file (JSON). The session descriptor file > can be used by the KernelSherk GUI to open a session which will directly > visualize the largest wake-up latency. > > Signed-off-by: Yordan Karadzhov <ykaradzhov@vmware.com> > --- > kernel-shark/bin/sched_wakeup.py | 96 ++++++++++++++++++++++++++++++++ > 1 file changed, 96 insertions(+) > create mode 100755 kernel-shark/bin/sched_wakeup.py > > diff --git a/kernel-shark/bin/sched_wakeup.py b/kernel-shark/bin/sched_wakeup.py <snip> > +while (i >= 0): > + if (evt[i] == ss_eid): Those while (...)s and if (...)s are very C style Python code :) Usually parentheses are needed only when the expression inside is too long and one wants to break it on several lines. Since we don't have a blessed Python style guide, I would suggest sticking to pep8[1]. There's also a pep8 checker tool[2] which comes handy and can point out unidiomatic code. Cheers, -- Slavi [1] https://www.python.org/dev/peps/pep-0008/ [2] `apt install pep8` on Debian systems
On Wed, Mar 27, 2019 at 6:04 PM Yordan Karadzhov <ykaradzhov@vmware.com> wrote: [...] > diff --git a/kernel-shark/bin/sched_wakeup.py b/kernel-shark/bin/sched_wakeup.py [...] > +ks.open_file(fname) > +ofst, cpu, evt, pid, ts = ks.load_data() Returning a tuple of 5 numpy arrays as API will be impossible to extend in the future without breaking existing code. It's also easy for users to get the order of the return values wrong. Have you thought of returning a container object that's holding the numpy arrays? Pandas DataFrame[1] comes to mind although I don't know if it's worth to take Pandas as dependency for this. [1] https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.DataFrame.html -- Slavi
Hi Slavi, Thanks a lot for the detailed review of this patch-set! I am about to send version 2 of NumPy interface patches which tries to address all the problems that you found. Please review the new version carefully and be as critical as possible. Thanks! Yordan On 28.03.19 г. 14:47 ч., Slavomir Kaslev wrote: > On Wed, Mar 27, 2019 at 6:04 PM Yordan Karadzhov <ykaradzhov@vmware.com> wrote: > [...] >> diff --git a/kernel-shark/bin/sched_wakeup.py b/kernel-shark/bin/sched_wakeup.py > > [...] > >> +ks.open_file(fname) >> +ofst, cpu, evt, pid, ts = ks.load_data() > > Returning a tuple of 5 numpy arrays as API will be impossible to > extend in the future without breaking existing code. It's also easy > for users to get the order of the return values wrong. Have you > thought of returning a container object that's holding the numpy > arrays? Pandas DataFrame[1] comes to mind although I don't know if > it's worth to take Pandas as dependency for this. > > [1] https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.DataFrame.html > > -- Slavi >
diff --git a/kernel-shark/bin/sched_wakeup.py b/kernel-shark/bin/sched_wakeup.py new file mode 100755 index 0000000..79dbedd --- /dev/null +++ b/kernel-shark/bin/sched_wakeup.py @@ -0,0 +1,96 @@ +#!/usr/bin/env python2 + +import matplotlib.pyplot as plt +import libkshark_wrapper as ks +import scipy.stats as st +import numpy as np +import array +import json +import sys + +fname = str(sys.argv[1]) + +ks.open_file(fname) +ofst, cpu, evt, pid, ts = ks.load_data() +tasks = ks.get_tasks() +task_pid = tasks["pulseaudio"] + +ss_eid = ks.event_id("sched", "sched_switch"); +w_eid = ks.event_id("sched", "sched_waking"); + +i = evt.size - 1 +dt = [] +delta_max = i_ss_max = i_sw_max = 0 + +while (i >= 0): + if (evt[i] == ss_eid): + next_pid = ks.read_event_field(offset=ofst[i], + sys="sched", + event="sched_switch", + field="next_pid") + + if (next_pid == task_pid): + time_ss = ts[i] + index_ss = i + + while(i >= 0): + i = i - 1 + if (evt[i] == w_eid): + waking_pid = ks.read_event_field(offset=ofst[i], + sys="sched", + event="sched_waking", + field="pid") + + if (waking_pid == task_pid): + delta = (time_ss - ts[i]) / 1000. + #print delta + dt.append(delta); + if (delta > delta_max): + print "lat. max: ", delta + i_ss_max = index_ss + i_sw_max = i + delta_max = delta + + break + + i = i - 1 + +desc = st.describe(np.array(dt)) +print desc + +fig, ax = plt.subplots(nrows=1, ncols=1) +fig.set_figheight(6) +fig.set_figwidth(7) + +rect = fig.patch +rect.set_facecolor('white') + +ax.set_xlabel('latency [$\mu$s]') +plt.yscale('log') +ax.hist(dt, bins=(200), histtype=u'step'); +plt.show() + +sname = 'sched.json' +ks.new_session(fname, sname) + +with open(sname, "r+") as s: + session = json.load(s) + session['TaskPlots'] = [task_pid] + session['CPUPlots'] = [int(cpu[i_sw_max]), int(cpu[i_ss_max])] + + delta = ts[i_ss_max] - ts[i_sw_max] + tmin = int(ts[i_sw_max] - delta) + tmax = int(ts[i_ss_max] + delta) + session['Model']['range'] = [tmin, tmax] + + session['Markers']['markA']['isSet'] = True + session['Markers']['markA']['row'] = int(i_sw_max) + + session['Markers']['markB']['isSet'] = True + session['Markers']['markB']['row'] = int(i_ss_max) + + session['ViewTop'] = int(i_sw_max) - 5 + + ks.save_session(session, s) + +ks.close()
The example script plots the distribution (histogram) of the pulseaudio wake-up times and finds the biggest latency. The script also generates a KernelShark session descriptor file (JSON). The session descriptor file can be used by the KernelSherk GUI to open a session which will directly visualize the largest wake-up latency. Signed-off-by: Yordan Karadzhov <ykaradzhov@vmware.com> --- kernel-shark/bin/sched_wakeup.py | 96 ++++++++++++++++++++++++++++++++ 1 file changed, 96 insertions(+) create mode 100755 kernel-shark/bin/sched_wakeup.py