#
bce29ac9 |
| 22-Jun-2021 |
Daniel Bristot de Oliveira <bristot@redhat.com> |
trace: Add osnoise tracer
In the context of high-performance computing (HPC), the Operating System Noise (*osnoise*) refers to the interference experienced by an application due to activities inside
trace: Add osnoise tracer
In the context of high-performance computing (HPC), the Operating System Noise (*osnoise*) refers to the interference experienced by an application due to activities inside the operating system. In the context of Linux, NMIs, IRQs, SoftIRQs, and any other system thread can cause noise to the system. Moreover, hardware-related jobs can also cause noise, for example, via SMIs.
The osnoise tracer leverages the hwlat_detector by running a similar loop with preemption, SoftIRQs and IRQs enabled, thus allowing all the sources of *osnoise* during its execution. Using the same approach of hwlat, osnoise takes note of the entry and exit point of any source of interferences, increasing a per-cpu interference counter. The osnoise tracer also saves an interference counter for each source of interference. The interference counter for NMI, IRQs, SoftIRQs, and threads is increased anytime the tool observes these interferences' entry events. When a noise happens without any interference from the operating system level, the hardware noise counter increases, pointing to a hardware-related noise. In this way, osnoise can account for any source of interference. At the end of the period, the osnoise tracer prints the sum of all noise, the max single noise, the percentage of CPU available for the thread, and the counters for the noise sources.
Usage
Write the ASCII text "osnoise" into the current_tracer file of the tracing system (generally mounted at /sys/kernel/tracing).
For example::
[root@f32 ~]# cd /sys/kernel/tracing/ [root@f32 tracing]# echo osnoise > current_tracer
It is possible to follow the trace by reading the trace trace file::
[root@f32 tracing]# cat trace # tracer: osnoise # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth MAX # || / SINGLE Interference counters: # |||| RUNTIME NOISE % OF CPU NOISE +-----------------------------+ # TASK-PID CPU# |||| TIMESTAMP IN US IN US AVAILABLE IN US HW NMI IRQ SIRQ THREAD # | | | |||| | | | | | | | | | | <...>-859 [000] .... 81.637220: 1000000 190 99.98100 9 18 0 1007 18 1 <...>-860 [001] .... 81.638154: 1000000 656 99.93440 74 23 0 1006 16 3 <...>-861 [002] .... 81.638193: 1000000 5675 99.43250 202 6 0 1013 25 21 <...>-862 [003] .... 81.638242: 1000000 125 99.98750 45 1 0 1011 23 0 <...>-863 [004] .... 81.638260: 1000000 1721 99.82790 168 7 0 1002 49 41 <...>-864 [005] .... 81.638286: 1000000 263 99.97370 57 6 0 1006 26 2 <...>-865 [006] .... 81.638302: 1000000 109 99.98910 21 3 0 1006 18 1 <...>-866 [007] .... 81.638326: 1000000 7816 99.21840 107 8 0 1016 39 19
In addition to the regular trace fields (from TASK-PID to TIMESTAMP), the tracer prints a message at the end of each period for each CPU that is running an osnoise/CPU thread. The osnoise specific fields report:
- The RUNTIME IN USE reports the amount of time in microseconds that the osnoise thread kept looping reading the time. - The NOISE IN US reports the sum of noise in microseconds observed by the osnoise tracer during the associated runtime. - The % OF CPU AVAILABLE reports the percentage of CPU available for the osnoise thread during the runtime window. - The MAX SINGLE NOISE IN US reports the maximum single noise observed during the runtime window. - The Interference counters display how many each of the respective interference happened during the runtime window.
Note that the example above shows a high number of HW noise samples. The reason being is that this sample was taken on a virtual machine, and the host interference is detected as a hardware interference.
Tracer options
The tracer has a set of options inside the osnoise directory, they are:
- osnoise/cpus: CPUs at which a osnoise thread will execute. - osnoise/period_us: the period of the osnoise thread. - osnoise/runtime_us: how long an osnoise thread will look for noise. - osnoise/stop_tracing_us: stop the system tracing if a single noise higher than the configured value happens. Writing 0 disables this option. - osnoise/stop_tracing_total_us: stop the system tracing if total noise higher than the configured value happens. Writing 0 disables this option. - tracing_threshold: the minimum delta between two time() reads to be considered as noise, in us. When set to 0, the default value will be used, which is currently 5 us.
Additional Tracing
In addition to the tracer, a set of tracepoints were added to facilitate the identification of the osnoise source.
- osnoise:sample_threshold: printed anytime a noise is higher than the configurable tolerance_ns. - osnoise:nmi_noise: noise from NMI, including the duration. - osnoise:irq_noise: noise from an IRQ, including the duration. - osnoise:softirq_noise: noise from a SoftIRQ, including the duration. - osnoise:thread_noise: noise from a thread, including the duration.
Note that all the values are *net values*. For example, if while osnoise is running, another thread preempts the osnoise thread, it will start a thread_noise duration at the start. Then, an IRQ takes place, preempting the thread_noise, starting a irq_noise. When the IRQ ends its execution, it will compute its duration, and this duration will be subtracted from the thread_noise, in such a way as to avoid the double accounting of the IRQ execution. This logic is valid for all sources of noise.
Here is one example of the usage of these tracepoints::
osnoise/8-961 [008] d.h. 5789.857532: irq_noise: local_timer:236 start 5789.857529929 duration 1845 ns osnoise/8-961 [008] dNh. 5789.858408: irq_noise: local_timer:236 start 5789.858404871 duration 2848 ns migration/8-54 [008] d... 5789.858413: thread_noise: migration/8:54 start 5789.858409300 duration 3068 ns osnoise/8-961 [008] .... 5789.858413: sample_threshold: start 5789.858404555 duration 8723 ns interferences 2
In this example, a noise sample of 8 microseconds was reported in the last line, pointing to two interferences. Looking backward in the trace, the two previous entries were about the migration thread running after a timer IRQ execution. The first event is not part of the noise because it took place one millisecond before.
It is worth noticing that the sum of the duration reported in the tracepoints is smaller than eight us reported in the sample_threshold. The reason roots in the overhead of the entry and exit code that happens before and after any interference execution. This justifies the dual approach: measuring thread and tracing.
Link: https://lkml.kernel.org/r/e649467042d60e7b62714c9c6751a56299d15119.1624372313.git.bristot@redhat.com
Cc: Phil Auld <pauld@redhat.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Kate Carcia <kcarcia@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alexandre Chartre <alexandre.chartre@oracle.com> Cc: Clark Willaims <williams@redhat.com> Cc: John Kacur <jkacur@redhat.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: linux-doc@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> [ Made the following functions static: trace_irqentry_callback() trace_irqexit_callback() trace_intel_irqentry_callback() trace_intel_irqexit_callback()
Added to include/trace.h: osnoise_arch_register() osnoise_arch_unregister()
Fixed define logic for LATENCY_FS_NOTIFY
Reported-by: kernel test robot <lkp@intel.com> ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
show more ...
|
Revision tags: v5.10.43 |
|
#
c441bfb5 |
| 09-Jun-2021 |
Mark Brown <broonie@kernel.org> |
Merge tag 'v5.13-rc3' into asoc-5.13
Linux 5.13-rc3
|
Revision tags: v5.10.42 |
|
#
942baad2 |
| 02-Jun-2021 |
Joonas Lahtinen <joonas.lahtinen@linux.intel.com> |
Merge drm/drm-next into drm-intel-gt-next
Pulling in -rc2 fixes and TTM changes that next upcoming patches depend on.
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
Revision tags: v5.10.41, v5.10.40, v5.10.39 |
|
#
c37fe6af |
| 18-May-2021 |
Mark Brown <broonie@kernel.org> |
Merge tag 'v5.13-rc2' into spi-5.13
Linux 5.13-rc2
|
#
85ebe5ae |
| 18-May-2021 |
Tony Lindgren <tony@atomide.com> |
Merge branch 'fixes-rc1' into fixes
|
#
d22fe808 |
| 17-May-2021 |
Rodrigo Vivi <rodrigo.vivi@intel.com> |
Merge drm/drm-next into drm-intel-next
Time to get back in sync...
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
a4345a7c |
| 17-May-2021 |
Paolo Bonzini <pbonzini@redhat.com> |
Merge tag 'kvmarm-fixes-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 5.13, take #1
- Fix regression with irqbypass not restarting the guest o
Merge tag 'kvmarm-fixes-5.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 5.13, take #1
- Fix regression with irqbypass not restarting the guest on failed connect - Fix regression with debug register decoding resulting in overlapping access - Commit exception state on exit to usrspace - Fix the MMU notifier return values - Add missing 'static' qualifiers in the new host stage-2 code
show more ...
|
Revision tags: v5.4.119 |
|
#
fd531024 |
| 11-May-2021 |
Thomas Zimmermann <tzimmermann@suse.de> |
Merge drm/drm-next into drm-misc-next
Backmerging to get v5.12 fixes. Requested for vmwgfx.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
Revision tags: v5.10.36 |
|
#
c55b44c9 |
| 11-May-2021 |
Maxime Ripard <maxime@cerno.tech> |
Merge drm/drm-fixes into drm-misc-fixes
Start this new release drm-misc-fixes branch
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
|
#
f96271ce |
| 08-May-2021 |
Michael Ellerman <mpe@ellerman.id.au> |
Merge branch 'master' into next
Merge master back into next, this allows us to resolve some conflicts in arch/powerpc/Kconfig, and also re-sort the symbols under config PPC so that they are in alpha
Merge branch 'master' into next
Merge master back into next, this allows us to resolve some conflicts in arch/powerpc/Kconfig, and also re-sort the symbols under config PPC so that they are in alphabetical order again.
show more ...
|
Revision tags: v5.10.35 |
|
#
9b1f61d5 |
| 03-May-2021 |
Linus Torvalds <torvalds@linux-foundation.org> |
Merge tag 'trace-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt: "New feature:
- A new "func-no-repeats" option in tracefs/
Merge tag 'trace-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt: "New feature:
- A new "func-no-repeats" option in tracefs/options directory.
When set the function tracer will detect if the current function being traced is the same as the previous one, and instead of recording it, it will keep track of the number of times that the function is repeated in a row. And when another function is recorded, it will write a new event that shows the function that repeated, the number of times it repeated and the time stamp of when the last repeated function occurred.
Enhancements:
- In order to implement the above "func-no-repeats" option, the ring buffer timestamp can now give the accurate timestamp of the event as it is being recorded, instead of having to record an absolute timestamp for all events. This helps the histogram code which no longer needs to waste ring buffer space.
- New validation logic to make sure all trace events that access dereferenced pointers do so in a safe way, and will warn otherwise.
Fixes:
- No longer limit the PIDs of tasks that are recorded for "saved_cmdlines" to PID_MAX_DEFAULT (32768), as systemd now allows for a much larger range. This caused the mapping of PIDs to the task names to be dropped for all tasks with a PID greater than 32768.
- Change trace_clock_global() to never block. This caused a deadlock.
Clean ups:
- Typos, prototype fixes, and removing of duplicate or unused code.
- Better management of ftrace_page allocations"
* tag 'trace-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (32 commits) tracing: Restructure trace_clock_global() to never block tracing: Map all PIDs to command lines ftrace: Reuse the output of the function tracer for func_repeats tracing: Add "func_no_repeats" option for function tracing tracing: Unify the logic for function tracing options tracing: Add method for recording "func_repeats" events tracing: Add "last_func_repeats" to struct trace_array tracing: Define new ftrace event "func_repeats" tracing: Define static void trace_print_time() ftrace: Simplify the calculation of page number for ftrace_page->records some more ftrace: Store the order of pages allocated in ftrace_page tracing: Remove unused argument from "ring_buffer_time_stamp() tracing: Remove duplicate struct declaration in trace_events.h tracing: Update create_system_filter() kernel-doc comment tracing: A minor cleanup for create_system_filter() kernel: trace: Mundane typo fixes in the file trace_events_filter.c tracing: Fix various typos in comments scripts/recordmcount.pl: Make vim and emacs indent the same scripts/recordmcount.pl: Make indent spacing consistent tracing: Add a verifier to check string pointers for trace events ...
show more ...
|
Revision tags: v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31 |
|
#
f689e4f2 |
| 15-Apr-2021 |
Yordan Karadzhov (VMware) <y.karadz@gmail.com> |
tracing: Define new ftrace event "func_repeats"
The event aims to consolidate the function tracing record in the cases when a single function is called number of times consecutively.
while (cond)
tracing: Define new ftrace event "func_repeats"
The event aims to consolidate the function tracing record in the cases when a single function is called number of times consecutively.
while (cond) do_func();
This may happen in various scenarios (busy waiting for example). The new ftrace event can be used to show repeated function events with a single event and save space on the ring buffer
Link: https://lkml.kernel.org/r/20210415181854.147448-3-y.karadz@gmail.com
Signed-off-by: Yordan Karadzhov (VMware) <y.karadz@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
show more ...
|
Revision tags: v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101 |
|
#
cdd38c5f |
| 24-Feb-2021 |
Stefan Schmidt <stefan@datenfreihafen.org> |
Merge remote-tracking branch 'net/master'
|
Revision tags: v5.10.18 |
|
#
d6310078 |
| 23-Feb-2021 |
Jiri Kosina <jkosina@suse.cz> |
Merge branch 'for-5.12/google' into for-linus
- User experience improvements for hid-google from Nicolas Boichat
|
#
cbecf716 |
| 22-Feb-2021 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge branch 'next' into for-linus
Prepare input updates for 5.12 merge window.
|
#
415e915f |
| 22-Feb-2021 |
Dmitry Torokhov <dmitry.torokhov@gmail.com> |
Merge tag 'v5.11' into next
Merge with mainline to get latest APIs and device tree bindings.
|
Revision tags: v5.10.17, v5.11, v5.10.16, v5.10.15, v5.10.14 |
|
#
9d4d8572 |
| 02-Feb-2021 |
Russell King <rmk+kernel@armlinux.org.uk> |
Merge tag 'amba-make-remove-return-void' of https://git.pengutronix.de/git/ukl/linux into devel-stable
Tag for adaptions to struct amba_driver::remove changing prototype
|
#
715a1284 |
| 15-Jan-2021 |
Tony Lindgren <tony@atomide.com> |
Merge branch 'cpuidle-fix' into fixes
|
#
d263dfa7 |
| 15-Jan-2021 |
Joonas Lahtinen <joonas.lahtinen@linux.intel.com> |
Merge drm/drm-next into drm-intel-gt-next
Backmerging to get a common base for merging topic branches between drm-intel-next and drm-intel-gt-next.
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@l
Merge drm/drm-next into drm-intel-gt-next
Backmerging to get a common base for merging topic branches between drm-intel-next and drm-intel-gt-next.
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
show more ...
|
#
10205618 |
| 08-Jan-2021 |
Rodrigo Vivi <rodrigo.vivi@intel.com> |
Merge drm/drm-next into drm-intel-next
sync-up to not fall too much behind.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
6dcb8bf9 |
| 07-Jan-2021 |
Takashi Iwai <tiwai@suse.de> |
Merge branch 'for-linus' into for-next
Back-merge of 5.11-devel branch for syncing the result changes.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
#
7b622755 |
| 07-Jan-2021 |
Takashi Iwai <tiwai@suse.de> |
Merge tag 'asoc-fix-v5.11-rc2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v5.11
A collection of mostly driver specific fixes, plus a maintainers
Merge tag 'asoc-fix-v5.11-rc2' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v5.11
A collection of mostly driver specific fixes, plus a maintainership update for TI and a fix for DAPM driver removal paths.
show more ...
|
#
2313f470 |
| 07-Jan-2021 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
Merge drm/drm-next into drm-misc-next
Staying in sync to drm-next, and to be able to pull ttm fixes.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
|
#
8db90aa3 |
| 28-Dec-2020 |
Mark Brown <broonie@kernel.org> |
Merge tag 'v5.11-rc1' into spi-5.11
Linux 5.11-rc1
|
#
2ae6f64c |
| 28-Dec-2020 |
Mark Brown <broonie@kernel.org> |
Merge tag 'v5.11-rc1' into regulator-5.11
Linux 5.11-rc1
|