Merge tag 'trace-tools-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing tools updates from Steven Rostedt:

 - Use total duration to calculate average in rtla osnoise_hist

 - Use 2 digit precision for displaying average

 - Print an intuitive auto analysis of timerlat results

 - Add auto analysis to timerlat top

 - Add hwnoise, which is the same as osnoise but focuses on hardware

 - Small clean ups

* tag 'trace-tools-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  Documentation/rtla: Add hwnoise man page
  rtla: Add hwnoise tool
  Documentation/rtla: Add timerlat-top auto-analysis options
  rtla/timerlat: Add auto-analysis support to timerlat top
  rtla/timerlat: Add auto-analysis core
  tools/tracing/rtla: osnoise_hist: display average with two-digit precision
  tools/tracing/rtla: osnoise_hist: use total duration for average calculation
  tools/rv: Remove unneeded semicolon
torvalds committed Feb 23, 2023
2 parents 2562af6 + 5dc3750 commit d392e49
Showing 15 changed files with 1,442 additions and 113 deletions.
7 changes: 7 additions & 0 deletions Documentation/tools/rtla/common_timerlat_aa.rst
@@ -0,0 +1,7 @@
**--dump-tasks**

prints the task running on all CPUs if stop conditions are met (depends on !--no-aa)

**--no-aa**

disable auto-analysis, reducing rtla timerlat cpu usage
1 change: 1 addition & 0 deletions Documentation/tools/rtla/index.rst
@@ -17,6 +17,7 @@ behavior on specific hardware.
rtla-timerlat
rtla-timerlat-hist
rtla-timerlat-top
rtla-hwnoise

.. only:: subproject and html

107 changes: 107 additions & 0 deletions Documentation/tools/rtla/rtla-hwnoise.rst
@@ -0,0 +1,107 @@
.. SPDX-License-Identifier: GPL-2.0
============
rtla-hwnoise
============
------------------------------------------
Detect and quantify hardware-related noise
------------------------------------------

:Manual section: 1

SYNOPSIS
========

**rtla hwnoise** [*OPTIONS*]

DESCRIPTION
===========

**rtla hwnoise** collects the periodic summary from the *osnoise* tracer
running with *interrupts disabled*. By disabling interrupts, and the scheduling
of threads as a consequence, only non-maskable interrupts and hardware-related
noise are allowed.

The tool also allows the configuration of the *osnoise* tracer and the
collection of the tracer output.

OPTIONS
=======
.. include:: common_osnoise_options.rst

.. include:: common_top_options.rst

.. include:: common_options.rst

EXAMPLE
=======
In the example below, the **rtla hwnoise** tool is set to run on CPUs *1-7*
on a system with 8 cores/16 threads with hyper-threading enabled.

The tool is set to detect any noise higher than *one microsecond*,
to run for *ten minutes*, displaying a summary of the report at the
end of the session::

  # rtla hwnoise -c 1-7 -T 1 -d 10m -q
                                          Hardware-related Noise
  duration:   0 00:10:00 | time is in us
  CPU Period         Runtime        Noise  % CPU Aval   Max Noise   Max Single          HW          NMI
    1 #599         599000000          138    99.99997           3            3           4           74
    2 #599         599000000           85    99.99998           3            3           4           75
    3 #599         599000000           86    99.99998           4            3           6           75
    4 #599         599000000           81    99.99998           4            4           2           75
    5 #599         599000000           85    99.99998           2            2           2           75
    6 #599         599000000           76    99.99998           2            2           0           75
    7 #599         599000000           77    99.99998           3            3           0           75


The first column shows the *CPU*, and the second column shows how many
*Periods* the tool ran during the session. The *Runtime* is the time
the tool effectively runs on the CPU. The *Noise* column is the sum of
all noise that the tool observed, and the *% CPU Aval* is the percentage
of the *Runtime* that was not consumed by *Noise*.
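
The *% CPU Aval* value can be recomputed from the *Runtime* and *Noise*
columns. The sketch below is only an illustration: the function name
``cpu_availability`` is hypothetical and not part of rtla, and the formula
(the share of the runtime left after subtracting the noise) is an assumption
that is consistent with the values in the example output, which appear to be
truncated to five decimal places.

```python
def cpu_availability(runtime_us: int, noise_us: int) -> float:
    """Return the percentage of the runtime that was free of noise.

    Assumption (not taken from the rtla sources):
        availability = (runtime - noise) / runtime * 100
    """
    if runtime_us <= 0:
        raise ValueError("runtime must be positive")
    return (runtime_us - noise_us) / runtime_us * 100.0


# Runtime and Noise taken from the CPU 1 row of the example output.
print(f"{cpu_availability(599_000_000, 138):.7f} %")
```

Because the noise is a few dozen microseconds against roughly ten minutes of
runtime, the result only departs from 100 % in the fifth decimal place, which
is why the column is printed with that precision.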

The *Max Noise* column is the maximum hardware noise the tool detected in a
single period, and the *Max Single* is the maximum single noise seen.

The *HW* and *NMI* columns show the total number of *hardware* and *NMI* noise
occurrences observed by the tool.

For example, *CPU 3* ran *599* periods of *1 second Runtime*. The CPU received
*86 us* of noise during the entire execution, leaving *99.99998 %* of CPU time
for the application. In the worst single period, the CPU caused *4 us* of
noise to the application, but this was certainly caused by more than one
noise occurrence, as the *Max Single* noise was *3 us*. The CPU has *HW noise*,
at a rate of *six occurrences*/*ten minutes*. The CPU also has *NMIs*, at a
higher frequency: around *seven per minute* (*75* in *ten minutes*).
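
The occurrence rates for *CPU 3* can be derived directly from the *HW* and
*NMI* counters and the session length. The helper below is a minimal sketch
with a hypothetical name (``rate_per``); it only divides the counters from the
example output by the ten-minute duration passed with ``-d 10m``.

```python
def rate_per(count: int, session_s: float, unit_s: float) -> float:
    """Average number of occurrences per `unit_s` seconds of the session."""
    if session_s <= 0:
        raise ValueError("session length must be positive")
    return count / session_s * unit_s


SESSION_S = 10 * 60  # the example session ran with -d 10m

# Counters taken from the CPU 3 row of the example output: HW = 6, NMI = 75.
hw_per_ten_minutes = rate_per(6, SESSION_S, 600)
nmi_per_minute = rate_per(75, SESSION_S, 60)
nmi_per_second = rate_per(75, SESSION_S, 1)

print(hw_per_ten_minutes, nmi_per_minute, nmi_per_second)
```

With these counters the NMI rate works out to 7.5 per minute, or one NMI
roughly every eight seconds.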

In the ideal situation, the tool reports *0* hardware-related noise.
For example, by disabling hyper-threading to remove the hardware noise,
and disabling the TSC watchdog to remove the NMI (it is possible to identify
this using tracing options of **rtla hwnoise**), it was possible to reach
the ideal situation on the same hardware::

  # rtla hwnoise -c 1-7 -T 1 -d 10m -q
                                          Hardware-related Noise
  duration:   0 00:10:00 | time is in us
  CPU Period         Runtime        Noise  % CPU Aval   Max Noise   Max Single          HW          NMI
    1 #599         599000000            0   100.00000           0            0           0            0
    2 #599         599000000            0   100.00000           0            0           0            0
    3 #599         599000000            0   100.00000           0            0           0            0
    4 #599         599000000            0   100.00000           0            0           0            0
    5 #599         599000000            0   100.00000           0            0           0            0
    6 #599         599000000            0   100.00000           0            0           0            0
    7 #599         599000000            0   100.00000           0            0           0            0

SEE ALSO
========

**rtla-osnoise**\(1)

Osnoise tracer documentation: <https://www.kernel.org/doc/html/latest/trace/osnoise-tracer.html>

AUTHOR
======
Written by Daniel Bristot de Oliveira <bristot@kernel.org>

.. include:: common_appendix.rst
164 changes: 73 additions & 91 deletions Documentation/tools/rtla/rtla-timerlat-top.rst
@@ -30,102 +30,84 @@ OPTIONS

.. include:: common_options.rst

.. include:: common_timerlat_aa.rst

EXAMPLE
=======

In the example below, the *timerlat* tracer is set to capture the stack trace at
the IRQ handler, printing it to the buffer if the *Thread* timer latency is
higher than *30 us*. It is also set to stop the session if a *Thread* timer
latency higher than *30 us* is hit. Finally, it is set to save the trace
buffer if the stop condition is hit::
In the example below, the timerlat tracer is dispatched on CPUs *1-23* in the
automatic trace mode, instructing the tracer to stop if a latency of *40 us*
or higher is found::

[root@alien ~]# rtla timerlat top -s 30 -T 30 -t
Timer Latency
0 00:00:59 | IRQ Timer Latency (us) | Thread Timer Latency (us)
# timerlat -a 40 -c 1-23 -q
Timer Latency
0 00:00:12 | IRQ Timer Latency (us) | Thread Timer Latency (us)
CPU COUNT | cur min avg max | cur min avg max
0 #58634 | 1 0 1 10 | 11 2 10 23
1 #58634 | 1 0 1 9 | 12 2 9 23
2 #58634 | 0 0 1 11 | 10 2 9 23
3 #58634 | 1 0 1 11 | 11 2 9 24
4 #58634 | 1 0 1 10 | 11 2 9 26
5 #58634 | 1 0 1 8 | 10 2 9 25
6 #58634 | 12 0 1 12 | 30 2 10 30 <--- CPU with spike
7 #58634 | 1 0 1 9 | 11 2 9 23
8 #58633 | 1 0 1 9 | 11 2 9 26
9 #58633 | 1 0 1 9 | 10 2 9 26
10 #58633 | 1 0 1 13 | 11 2 9 28
11 #58633 | 1 0 1 13 | 12 2 9 24
12 #58633 | 1 0 1 8 | 10 2 9 23
13 #58633 | 1 0 1 10 | 10 2 9 22
14 #58633 | 1 0 1 18 | 12 2 9 27
15 #58633 | 1 0 1 10 | 11 2 9 28
16 #58633 | 0 0 1 11 | 7 2 9 26
17 #58633 | 1 0 1 13 | 10 2 9 24
18 #58633 | 1 0 1 9 | 13 2 9 22
19 #58633 | 1 0 1 10 | 11 2 9 23
20 #58633 | 1 0 1 12 | 11 2 9 28
21 #58633 | 1 0 1 14 | 11 2 9 24
22 #58633 | 1 0 1 8 | 11 2 9 22
23 #58633 | 1 0 1 10 | 11 2 9 27
timerlat hit stop tracing
saving trace to timerlat_trace.txt
[root@alien bristot]# tail -60 timerlat_trace.txt
[...]
timerlat/5-79755 [005] ....... 426.271226: #58634 context thread timer_latency 10823 ns
sh-109404 [006] dnLh213 426.271247: #58634 context irq timer_latency 12505 ns
sh-109404 [006] dNLh313 426.271258: irq_noise: local_timer:236 start 426.271245463 duration 12553 ns
sh-109404 [006] d...313 426.271263: thread_noise: sh:109404 start 426.271245853 duration 4769 ns
timerlat/6-79756 [006] ....... 426.271264: #58634 context thread timer_latency 30328 ns
timerlat/6-79756 [006] ....1.. 426.271265: <stack trace>
=> timerlat_irq
=> __hrtimer_run_queues
=> hrtimer_interrupt
=> __sysvec_apic_timer_interrupt
=> sysvec_apic_timer_interrupt
=> asm_sysvec_apic_timer_interrupt
=> _raw_spin_unlock_irqrestore <---- spinlock that disabled interrupt.
=> try_to_wake_up
=> autoremove_wake_function
=> __wake_up_common
=> __wake_up_common_lock
=> ep_poll_callback
=> __wake_up_common
=> __wake_up_common_lock
=> fsnotify_add_event
=> inotify_handle_inode_event
=> fsnotify
=> __fsnotify_parent
=> __fput
=> task_work_run
=> exit_to_user_mode_prepare
=> syscall_exit_to_user_mode
=> do_syscall_64
=> entry_SYSCALL_64_after_hwframe
=> 0x7265000001378c
=> 0x10000cea7
=> 0x25a00000204a
=> 0x12e302d00000000
=> 0x19b51010901b6
=> 0x283ce00726500
=> 0x61ea308872
=> 0x00000fe3
bash-109109 [007] d..h... 426.271265: #58634 context irq timer_latency 1211 ns
timerlat/6-79756 [006] ....... 426.271267: timerlat_main: stop tracing hit on cpu 6

In the trace, it is possible to notice that the *IRQ* timer latency was
already high, accounting *12505 ns*. The IRQ delay was caused by the
*bash-109109* process that disabled IRQs in the wake-up path
(*_try_to_wake_up()* function). The duration of the IRQ handler that woke
up the timerlat thread, informed with the **osnoise:irq_noise** event, was
also high and added more *12553 ns* to the Thread latency. Finally, the
**osnoise:thread_noise** added by the currently running thread (including
the scheduling overhead) added more *4769 ns*. Summing up these values,
the *Thread* timer latency accounted for *30328 ns*.

The primary reason for this high value is the wake-up path that was hit
twice during this case: when the *bash-109109* was waking up a thread
and then when the *timerlat* thread was awakened. This information can
then be used as the starting point of a more fine-grained analysis.
1 #12322 | 0 0 1 15 | 10 3 9 31
2 #12322 | 3 0 1 12 | 10 3 9 23
3 #12322 | 1 0 1 21 | 8 2 8 34
4 #12322 | 1 0 1 17 | 10 2 11 33
5 #12322 | 0 0 1 12 | 8 3 8 25
6 #12322 | 1 0 1 14 | 16 3 11 35
7 #12322 | 0 0 1 14 | 9 2 8 29
8 #12322 | 1 0 1 22 | 9 3 9 34
9 #12322 | 0 0 1 14 | 8 2 8 24
10 #12322 | 1 0 0 12 | 9 3 8 24
11 #12322 | 0 0 0 15 | 6 2 7 29
12 #12321 | 1 0 0 13 | 5 3 8 23
13 #12319 | 0 0 1 14 | 9 3 9 26
14 #12321 | 1 0 0 13 | 6 2 8 24
15 #12321 | 1 0 1 15 | 12 3 11 27
16 #12318 | 0 0 1 13 | 7 3 10 24
17 #12319 | 0 0 1 13 | 11 3 9 25
18 #12318 | 0 0 0 12 | 8 2 8 20
19 #12319 | 0 0 1 18 | 10 2 9 28
20 #12317 | 0 0 0 20 | 9 3 8 34
21 #12318 | 0 0 0 13 | 8 3 8 28
22 #12319 | 0 0 1 11 | 8 3 10 22
23 #12320 | 28 0 1 28 | 41 3 11 41
rtla timerlat hit stop tracing
## CPU 23 hit stop tracing, analyzing it ##
IRQ handler delay: 27.49 us (65.52 %)
IRQ latency: 28.13 us
Timerlat IRQ duration: 9.59 us (22.85 %)
Blocking thread: 3.79 us (9.03 %)
objtool:49256 3.79 us
Blocking thread stacktrace
-> timerlat_irq
-> __hrtimer_run_queues
-> hrtimer_interrupt
-> __sysvec_apic_timer_interrupt
-> sysvec_apic_timer_interrupt
-> asm_sysvec_apic_timer_interrupt
-> _raw_spin_unlock_irqrestore
-> cgroup_rstat_flush_locked
-> cgroup_rstat_flush_irqsafe
-> mem_cgroup_flush_stats
-> mem_cgroup_wb_stats
-> balance_dirty_pages
-> balance_dirty_pages_ratelimited_flags
-> btrfs_buffered_write
-> btrfs_do_write_iter
-> vfs_write
-> __x64_sys_pwrite64
-> do_syscall_64
-> entry_SYSCALL_64_after_hwframe
------------------------------------------------------------------------
Thread latency: 41.96 us (100%)

The system has exit from idle latency!
Max timerlat IRQ latency from idle: 17.48 us in cpu 4
Saving trace to timerlat_trace.txt

In this case, the major factor was the delay suffered by the *IRQ handler*
that handles the **timerlat** wakeup: *65.52%*. This can be caused by the
current thread masking interrupts, which can be seen in the blocking
thread stacktrace: the current thread (*objtool:49256*) disabled interrupts
via *raw spin lock* operations inside the memory cgroup code, while performing
a write syscall on a btrfs file system.

The raw trace is saved in the **timerlat_trace.txt** file for further analysis.
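
The percentages in the auto-analysis report can be cross-checked against the
*Thread latency* total. The sketch below is an illustration only: it recomputes
the shares from the rounded microsecond values printed in the example, so the
last digit may differ slightly from what the tool printed. The names ``share``
and ``components_us`` are hypothetical and not part of rtla.

```python
def share(component_us: float, total_us: float) -> float:
    """Return the percentage of the total latency taken by one component."""
    if total_us <= 0:
        raise ValueError("total latency must be positive")
    return component_us / total_us * 100.0


# Values copied from the auto-analysis output for CPU 23 in the example.
thread_latency_us = 41.96
components_us = {
    "IRQ handler delay": 27.49,
    "Timerlat IRQ duration": 9.59,
    "Blocking thread": 3.79,
}

for name, value_us in components_us.items():
    print(f"{name}: {value_us:.2f} us ({share(value_us, thread_latency_us):.2f} %)")

# Part of the thread latency that the listed components do not itemize.
unaccounted_us = thread_latency_us - sum(components_us.values())
print(f"not itemized: {unaccounted_us:.2f} us")
```

Sorting the components by share gives a quick way to see which factor to
investigate first; here the *IRQ handler delay* clearly dominates.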

Note that **rtla timerlat** was dispatched without changing *timerlat* tracer
threads' priority. That is generally not needed because these threads have
2 changes: 2 additions & 0 deletions tools/tracing/rtla/Makefile
@@ -119,6 +119,8 @@ install: doc_install
$(STRIP) $(DESTDIR)$(BINDIR)/rtla
@test ! -f $(DESTDIR)$(BINDIR)/osnoise || rm $(DESTDIR)$(BINDIR)/osnoise
ln -s rtla $(DESTDIR)$(BINDIR)/osnoise
@test ! -f $(DESTDIR)$(BINDIR)/hwnoise || rm $(DESTDIR)$(BINDIR)/hwnoise
ln -s rtla $(DESTDIR)$(BINDIR)/hwnoise
@test ! -f $(DESTDIR)$(BINDIR)/timerlat || rm $(DESTDIR)$(BINDIR)/timerlat
ln -s rtla $(DESTDIR)$(BINDIR)/timerlat
