diff mbox series

[3/3] Documentation/rtla: Add timerlat-top auto-analysis options

Message ID a3f5fb49432a55c3323b18725fc6e702f16ccc79.1675179318.git.bristot@kernel.org (mailing list archive)
State Accepted
Headers show
Series rtla: Timerlat auto-analysis | expand

Commit Message

Daniel Bristot de Oliveira Jan. 31, 2023, 3:38 p.m. UTC
Add the new options to the man page, as well as updating the
example to include the new output.

Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
---
 .../tools/rtla/common_timerlat_aa.rst         |   7 +
 .../tools/rtla/rtla-timerlat-top.rst          | 164 ++++++++----------
 2 files changed, 80 insertions(+), 91 deletions(-)
 create mode 100644 Documentation/tools/rtla/common_timerlat_aa.rst
diff mbox series

Patch

diff --git a/Documentation/tools/rtla/common_timerlat_aa.rst b/Documentation/tools/rtla/common_timerlat_aa.rst
new file mode 100644
index 000000000000..077029e6b289
--- /dev/null
+++ b/Documentation/tools/rtla/common_timerlat_aa.rst
@@ -0,0 +1,7 @@ 
+**--dump-tasks**
+
+        prints the task running on all CPUs if stop conditions are met (depends on !--no-aa)
+
+**--no-aa**
+
+        disable auto-analysis, reducing rtla timerlat cpu usage
diff --git a/Documentation/tools/rtla/rtla-timerlat-top.rst b/Documentation/tools/rtla/rtla-timerlat-top.rst
index 7c4e4b109493..73799c1150ad 100644
--- a/Documentation/tools/rtla/rtla-timerlat-top.rst
+++ b/Documentation/tools/rtla/rtla-timerlat-top.rst
@@ -30,102 +30,84 @@  OPTIONS
 
 .. include:: common_options.rst
 
+.. include:: common_timerlat_aa.rst
+
 EXAMPLE
 =======
 
-In the example below, the *timerlat* tracer is set to capture the stack trace at
-the IRQ handler, printing it to the buffer if the *Thread* timer latency is
-higher than *30 us*. It is also set to stop the session if a *Thread* timer
-latency higher than *30 us* is hit. Finally, it is set to save the trace
-buffer if the stop condition is hit::
+In the example below, the timerlat tracer is dispatched in cpus *1-23* in the
+automatic trace mode, instructing the tracer to stop if a *40 us* latency or
+higher is found::
 
-  [root@alien ~]# rtla timerlat top -s 30 -T 30 -t
-                   Timer Latency
-    0 00:00:59   |          IRQ Timer Latency (us)        |         Thread Timer Latency (us)
+  # timerlat -a 40 -c 1-23 -q
+                                     Timer Latency
+    0 00:00:12   |          IRQ Timer Latency (us)        |         Thread Timer Latency (us)
   CPU COUNT      |      cur       min       avg       max |      cur       min       avg       max
-    0 #58634     |        1         0         1        10 |       11         2        10        23
-    1 #58634     |        1         0         1         9 |       12         2         9        23
-    2 #58634     |        0         0         1        11 |       10         2         9        23
-    3 #58634     |        1         0         1        11 |       11         2         9        24
-    4 #58634     |        1         0         1        10 |       11         2         9        26
-    5 #58634     |        1         0         1         8 |       10         2         9        25
-    6 #58634     |       12         0         1        12 |       30         2        10        30 <--- CPU with spike
-    7 #58634     |        1         0         1         9 |       11         2         9        23
-    8 #58633     |        1         0         1         9 |       11         2         9        26
-    9 #58633     |        1         0         1         9 |       10         2         9        26
-   10 #58633     |        1         0         1        13 |       11         2         9        28
-   11 #58633     |        1         0         1        13 |       12         2         9        24
-   12 #58633     |        1         0         1         8 |       10         2         9        23
-   13 #58633     |        1         0         1        10 |       10         2         9        22
-   14 #58633     |        1         0         1        18 |       12         2         9        27
-   15 #58633     |        1         0         1        10 |       11         2         9        28
-   16 #58633     |        0         0         1        11 |        7         2         9        26
-   17 #58633     |        1         0         1        13 |       10         2         9        24
-   18 #58633     |        1         0         1         9 |       13         2         9        22
-   19 #58633     |        1         0         1        10 |       11         2         9        23
-   20 #58633     |        1         0         1        12 |       11         2         9        28
-   21 #58633     |        1         0         1        14 |       11         2         9        24
-   22 #58633     |        1         0         1         8 |       11         2         9        22
-   23 #58633     |        1         0         1        10 |       11         2         9        27
-  timerlat hit stop tracing
-  saving trace to timerlat_trace.txt
-  [root@alien bristot]# tail -60 timerlat_trace.txt
-  [...]
-      timerlat/5-79755   [005] .......   426.271226: #58634 context thread timer_latency     10823 ns
-              sh-109404  [006] dnLh213   426.271247: #58634 context    irq timer_latency     12505 ns
-              sh-109404  [006] dNLh313   426.271258: irq_noise: local_timer:236 start 426.271245463 duration 12553 ns
-              sh-109404  [006] d...313   426.271263: thread_noise:       sh:109404 start 426.271245853 duration 4769 ns
-      timerlat/6-79756   [006] .......   426.271264: #58634 context thread timer_latency     30328 ns
-      timerlat/6-79756   [006] ....1..   426.271265: <stack trace>
-  => timerlat_irq
-  => __hrtimer_run_queues
-  => hrtimer_interrupt
-  => __sysvec_apic_timer_interrupt
-  => sysvec_apic_timer_interrupt
-  => asm_sysvec_apic_timer_interrupt
-  => _raw_spin_unlock_irqrestore			<---- spinlock that disabled interrupt.
-  => try_to_wake_up
-  => autoremove_wake_function
-  => __wake_up_common
-  => __wake_up_common_lock
-  => ep_poll_callback
-  => __wake_up_common
-  => __wake_up_common_lock
-  => fsnotify_add_event
-  => inotify_handle_inode_event
-  => fsnotify
-  => __fsnotify_parent
-  => __fput
-  => task_work_run
-  => exit_to_user_mode_prepare
-  => syscall_exit_to_user_mode
-  => do_syscall_64
-  => entry_SYSCALL_64_after_hwframe
-  => 0x7265000001378c
-  => 0x10000cea7
-  => 0x25a00000204a
-  => 0x12e302d00000000
-  => 0x19b51010901b6
-  => 0x283ce00726500
-  => 0x61ea308872
-  => 0x00000fe3
-            bash-109109  [007] d..h...   426.271265: #58634 context    irq timer_latency      1211 ns
-      timerlat/6-79756   [006] .......   426.271267: timerlat_main: stop tracing hit on cpu 6
-
-In the trace, it is possible the notice that the *IRQ* timer latency was
-already high, accounting *12505 ns*. The IRQ delay was caused by the
-*bash-109109* process that disabled IRQs in the wake-up path
-(*_try_to_wake_up()* function). The duration of the IRQ handler that woke
-up the timerlat thread, informed with the **osnoise:irq_noise** event, was
-also high and added more *12553 ns* to the Thread latency. Finally, the
-**osnoise:thread_noise** added by the currently running thread (including
-the scheduling overhead) added more *4769 ns*. Summing up these values,
-the *Thread* timer latency accounted for *30328 ns*.
-
-The primary reason for this high value is the wake-up path that was hit
-twice during this case: when the *bash-109109* was waking up a thread
-and then when the *timerlat* thread was awakened. This information can
-then be used as the starting point of a more fine-grained analysis.
+    1 #12322     |        0         0         1        15 |       10         3         9        31
+    2 #12322     |        3         0         1        12 |       10         3         9        23
+    3 #12322     |        1         0         1        21 |        8         2         8        34
+    4 #12322     |        1         0         1        17 |       10         2        11        33
+    5 #12322     |        0         0         1        12 |        8         3         8        25
+    6 #12322     |        1         0         1        14 |       16         3        11        35
+    7 #12322     |        0         0         1        14 |        9         2         8        29
+    8 #12322     |        1         0         1        22 |        9         3         9        34
+    9 #12322     |        0         0         1        14 |        8         2         8        24
+   10 #12322     |        1         0         0        12 |        9         3         8        24
+   11 #12322     |        0         0         0        15 |        6         2         7        29
+   12 #12321     |        1         0         0        13 |        5         3         8        23
+   13 #12319     |        0         0         1        14 |        9         3         9        26
+   14 #12321     |        1         0         0        13 |        6         2         8        24
+   15 #12321     |        1         0         1        15 |       12         3        11        27
+   16 #12318     |        0         0         1        13 |        7         3        10        24
+   17 #12319     |        0         0         1        13 |       11         3         9        25
+   18 #12318     |        0         0         0        12 |        8         2         8        20
+   19 #12319     |        0         0         1        18 |       10         2         9        28
+   20 #12317     |        0         0         0        20 |        9         3         8        34
+   21 #12318     |        0         0         0        13 |        8         3         8        28
+   22 #12319     |        0         0         1        11 |        8         3        10        22
+   23 #12320     |       28         0         1        28 |       41         3        11        41
+  rtla timerlat hit stop tracing
+  ## CPU 23 hit stop tracing, analyzing it ##
+  IRQ handler delay:                                        27.49 us (65.52 %)
+  IRQ latency:                                              28.13 us
+  Timerlat IRQ duration:                                     9.59 us (22.85 %)
+  Blocking thread:                                           3.79 us (9.03 %)
+                         objtool:49256                       3.79 us
+    Blocking thread stacktrace
+                -> timerlat_irq
+                -> __hrtimer_run_queues
+                -> hrtimer_interrupt
+                -> __sysvec_apic_timer_interrupt
+                -> sysvec_apic_timer_interrupt
+                -> asm_sysvec_apic_timer_interrupt
+                -> _raw_spin_unlock_irqrestore
+                -> cgroup_rstat_flush_locked
+                -> cgroup_rstat_flush_irqsafe
+                -> mem_cgroup_flush_stats
+                -> mem_cgroup_wb_stats
+                -> balance_dirty_pages
+                -> balance_dirty_pages_ratelimited_flags
+                -> btrfs_buffered_write
+                -> btrfs_do_write_iter
+                -> vfs_write
+                -> __x64_sys_pwrite64
+                -> do_syscall_64
+                -> entry_SYSCALL_64_after_hwframe
+  ------------------------------------------------------------------------
+    Thread latency:                                          41.96 us (100%)
+
+  The system has exit from idle latency!
+    Max timerlat IRQ latency from idle: 17.48 us in cpu 4
+  Saving trace to timerlat_trace.txt
+
+In this case, the major factor was the delay suffered by the *IRQ handler*
+that handles **timerlat** wakeup: *65.52%*. This can be caused by the
+current thread masking interrupts, which can be seen in the blocking
+thread stacktrace: the current thread (*objtool:49256*) disabled interrupts
+via *raw spin lock* operations inside mem cgroup, while doing write
+syscall in a btrfs file system.
+
+The raw trace is saved in the **timerlat_trace.txt** file for further analysis.
 
 Note that **rtla timerlat** was dispatched without changing *timerlat* tracer
 threads' priority. That is generally not needed because these threads hava