1============== 2OSNOISE Tracer 3============== 4 5In the context of high-performance computing (HPC), the Operating System 6Noise (*osnoise*) refers to the interference experienced by an application 7due to activities inside the operating system. In the context of Linux, 8NMIs, IRQs, SoftIRQs, and any other system thread can cause noise to the 9system. Moreover, hardware-related jobs can also cause noise, for example, 10via SMIs. 11 12hwlat_detector is one of the tools used to identify the most complex 13source of noise: *hardware noise*. 14 15In a nutshell, the hwlat_detector creates a thread that runs 16periodically for a given period. At the beginning of a period, the thread 17disables interrupt and starts sampling. While running, the hwlatd 18thread reads the time in a loop. As interrupts are disabled, threads, 19IRQs, and SoftIRQs cannot interfere with the hwlatd thread. Hence, the 20cause of any gap between two different reads of the time roots either on 21NMI or in the hardware itself. At the end of the period, hwlatd enables 22interrupts and reports the max observed gap between the reads. It also 23prints a NMI occurrence counter. If the output does not report NMI 24executions, the user can conclude that the hardware is the culprit for 25the latency. The hwlat detects the NMI execution by observing 26the entry and exit of a NMI. 27 28The osnoise tracer leverages the hwlat_detector by running a 29similar loop with preemption, SoftIRQs and IRQs enabled, thus allowing 30all the sources of *osnoise* during its execution. Using the same approach 31of hwlat, osnoise takes note of the entry and exit point of any 32source of interferences, increasing a per-cpu interference counter. The 33osnoise tracer also saves an interference counter for each source of 34interference. The interference counter for NMI, IRQs, SoftIRQs, and 35threads is increased anytime the tool observes these interferences' entry 36events. When a noise happens without any interference from the operating 37system level, the hardware noise counter increases, pointing to a 38hardware-related noise. In this way, osnoise can account for any 39source of interference. At the end of the period, the osnoise tracer 40prints the sum of all noise, the max single noise, the percentage of CPU 41available for the thread, and the counters for the noise sources. 42 43Usage 44----- 45 46Write the ASCII text "osnoise" into the current_tracer file of the 47tracing system (generally mounted at /sys/kernel/tracing). 48 49For example:: 50 51 [root@f32 ~]# cd /sys/kernel/tracing/ 52 [root@f32 tracing]# echo osnoise > current_tracer 53 54It is possible to follow the trace by reading the trace file:: 55 56 [root@f32 tracing]# cat trace 57 # tracer: osnoise 58 # 59 # _-----=> irqs-off 60 # / _----=> need-resched 61 # | / _---=> hardirq/softirq 62 # || / _--=> preempt-depth MAX 63 # || / SINGLE Interference counters: 64 # |||| RUNTIME NOISE % OF CPU NOISE +-----------------------------+ 65 # TASK-PID CPU# |||| TIMESTAMP IN US IN US AVAILABLE IN US HW NMI IRQ SIRQ THREAD 66 # | | | |||| | | | | | | | | | | 67 <...>-859 [000] .... 81.637220: 1000000 190 99.98100 9 18 0 1007 18 1 68 <...>-860 [001] .... 81.638154: 1000000 656 99.93440 74 23 0 1006 16 3 69 <...>-861 [002] .... 81.638193: 1000000 5675 99.43250 202 6 0 1013 25 21 70 <...>-862 [003] .... 81.638242: 1000000 125 99.98750 45 1 0 1011 23 0 71 <...>-863 [004] .... 81.638260: 1000000 1721 99.82790 168 7 0 1002 49 41 72 <...>-864 [005] .... 81.638286: 1000000 263 99.97370 57 6 0 1006 26 2 73 <...>-865 [006] .... 81.638302: 1000000 109 99.98910 21 3 0 1006 18 1 74 <...>-866 [007] .... 81.638326: 1000000 7816 99.21840 107 8 0 1016 39 19 75 76In addition to the regular trace fields (from TASK-PID to TIMESTAMP), the 77tracer prints a message at the end of each period for each CPU that is 78running an osnoise/ thread. The osnoise specific fields report: 79 80 - The RUNTIME IN US reports the amount of time in microseconds that 81 the osnoise thread kept looping reading the time. 82 - The NOISE IN US reports the sum of noise in microseconds observed 83 by the osnoise tracer during the associated runtime. 84 - The % OF CPU AVAILABLE reports the percentage of CPU available for 85 the osnoise thread during the runtime window. 86 - The MAX SINGLE NOISE IN US reports the maximum single noise observed 87 during the runtime window. 88 - The Interference counters display how many each of the respective 89 interference happened during the runtime window. 90 91Note that the example above shows a high number of HW noise samples. 92The reason being is that this sample was taken on a virtual machine, 93and the host interference is detected as a hardware interference. 94 95Tracer options 96--------------------- 97 98The tracer has a set of options inside the osnoise directory, they are: 99 100 - osnoise/cpus: CPUs at which a osnoise thread will execute. 101 - osnoise/period_us: the period of the osnoise thread. 102 - osnoise/runtime_us: how long an osnoise thread will look for noise. 103 - osnoise/stop_tracing_us: stop the system tracing if a single noise 104 higher than the configured value happens. Writing 0 disables this 105 option. 106 - osnoise/stop_tracing_total_us: stop the system tracing if total noise 107 higher than the configured value happens. Writing 0 disables this 108 option. 109 - tracing_threshold: the minimum delta between two time() reads to be 110 considered as noise, in us. When set to 0, the default value will 111 be used, which is currently 5 us. 112 113Additional Tracing 114------------------ 115 116In addition to the tracer, a set of tracepoints were added to 117facilitate the identification of the osnoise source. 118 119 - osnoise:sample_threshold: printed anytime a noise is higher than 120 the configurable tolerance_ns. 121 - osnoise:nmi_noise: noise from NMI, including the duration. 122 - osnoise:irq_noise: noise from an IRQ, including the duration. 123 - osnoise:softirq_noise: noise from a SoftIRQ, including the 124 duration. 125 - osnoise:thread_noise: noise from a thread, including the duration. 126 127Note that all the values are *net values*. For example, if while osnoise 128is running, another thread preempts the osnoise thread, it will start a 129thread_noise duration at the start. Then, an IRQ takes place, preempting 130the thread_noise, starting a irq_noise. When the IRQ ends its execution, 131it will compute its duration, and this duration will be subtracted from 132the thread_noise, in such a way as to avoid the double accounting of the 133IRQ execution. This logic is valid for all sources of noise. 134 135Here is one example of the usage of these tracepoints:: 136 137 osnoise/8-961 [008] d.h. 5789.857532: irq_noise: local_timer:236 start 5789.857529929 duration 1845 ns 138 osnoise/8-961 [008] dNh. 5789.858408: irq_noise: local_timer:236 start 5789.858404871 duration 2848 ns 139 migration/8-54 [008] d... 5789.858413: thread_noise: migration/8:54 start 5789.858409300 duration 3068 ns 140 osnoise/8-961 [008] .... 5789.858413: sample_threshold: start 5789.858404555 duration 8812 ns interferences 2 141 142In this example, a noise sample of 8 microseconds was reported in the last 143line, pointing to two interferences. Looking backward in the trace, the 144two previous entries were about the migration thread running after a 145timer IRQ execution. The first event is not part of the noise because 146it took place one millisecond before. 147 148It is worth noticing that the sum of the duration reported in the 149tracepoints is smaller than eight us reported in the sample_threshold. 150The reason roots in the overhead of the entry and exit code that happens 151before and after any interference execution. This justifies the dual 152approach: measuring thread and tracing. 153