1############### 2Timerlat tracer 3############### 4 5The timerlat tracer aims to help the preemptive kernel developers to 6find sources of wakeup latencies of real-time threads. Like cyclictest, 7the tracer sets a periodic timer that wakes up a thread. The thread then 8computes a *wakeup latency* value as the difference between the *current 9time* and the *absolute time* that the timer was set to expire. The main 10goal of timerlat is tracing in such a way to help kernel developers. 11 12Usage 13----- 14 15Write the ASCII text "timerlat" into the current_tracer file of the 16tracing system (generally mounted at /sys/kernel/tracing). 17 18For example:: 19 20 [root@f32 ~]# cd /sys/kernel/tracing/ 21 [root@f32 tracing]# echo timerlat > current_tracer 22 23It is possible to follow the trace by reading the trace trace file:: 24 25 [root@f32 tracing]# cat trace 26 # tracer: timerlat 27 # 28 # _-----=> irqs-off 29 # / _----=> need-resched 30 # | / _---=> hardirq/softirq 31 # || / _--=> preempt-depth 32 # || / 33 # |||| ACTIVATION 34 # TASK-PID CPU# |||| TIMESTAMP ID CONTEXT LATENCY 35 # | | | |||| | | | | 36 <idle>-0 [000] d.h1 54.029328: #1 context irq timer_latency 932 ns 37 <...>-867 [000] .... 54.029339: #1 context thread timer_latency 11700 ns 38 <idle>-0 [001] dNh1 54.029346: #1 context irq timer_latency 2833 ns 39 <...>-868 [001] .... 54.029353: #1 context thread timer_latency 9820 ns 40 <idle>-0 [000] d.h1 54.030328: #2 context irq timer_latency 769 ns 41 <...>-867 [000] .... 54.030330: #2 context thread timer_latency 3070 ns 42 <idle>-0 [001] d.h1 54.030344: #2 context irq timer_latency 935 ns 43 <...>-868 [001] .... 54.030347: #2 context thread timer_latency 4351 ns 44 45 46The tracer creates a per-cpu kernel thread with real-time priority that 47prints two lines at every activation. The first is the *timer latency* 48observed at the *hardirq* context before the activation of the thread. 49The second is the *timer latency* observed by the thread. The ACTIVATION 50ID field serves to relate the *irq* execution to its respective *thread* 51execution. 52 53The *irq*/*thread* splitting is important to clarify in which context 54the unexpected high value is coming from. The *irq* context can be 55delayed by hardware-related actions, such as SMIs, NMIs, IRQs, 56or by thread masking interrupts. Once the timer happens, the delay 57can also be influenced by blocking caused by threads. For example, by 58postponing the scheduler execution via preempt_disable(), scheduler 59execution, or masking interrupts. Threads can also be delayed by the 60interference from other threads and IRQs. 61 62Tracer options 63--------------------- 64 65The timerlat tracer is built on top of osnoise tracer. 66So its configuration is also done in the osnoise/ config 67directory. The timerlat configs are: 68 69 - cpus: CPUs at which a timerlat thread will execute. 70 - timerlat_period_us: the period of the timerlat thread. 71 - stop_tracing_us: stop the system tracing if a 72 timer latency at the *irq* context higher than the configured 73 value happens. Writing 0 disables this option. 74 - stop_tracing_total_us: stop the system tracing if a 75 timer latency at the *thread* context is higher than the configured 76 value happens. Writing 0 disables this option. 77 - print_stack: save the stack of the IRQ occurrence, and print 78 it after the *thread context* event". 79 80timerlat and osnoise 81---------------------------- 82 83The timerlat can also take advantage of the osnoise: traceevents. 84For example:: 85 86 [root@f32 ~]# cd /sys/kernel/tracing/ 87 [root@f32 tracing]# echo timerlat > current_tracer 88 [root@f32 tracing]# echo 1 > events/osnoise/enable 89 [root@f32 tracing]# echo 25 > osnoise/stop_tracing_total_us 90 [root@f32 tracing]# tail -10 trace 91 cc1-87882 [005] d..h... 548.771078: #402268 context irq timer_latency 13585 ns 92 cc1-87882 [005] dNLh1.. 548.771082: irq_noise: local_timer:236 start 548.771077442 duration 7597 ns 93 cc1-87882 [005] dNLh2.. 548.771099: irq_noise: qxl:21 start 548.771085017 duration 7139 ns 94 cc1-87882 [005] d...3.. 548.771102: thread_noise: cc1:87882 start 548.771078243 duration 9909 ns 95 timerlat/5-1035 [005] ....... 548.771104: #402268 context thread timer_latency 39960 ns 96 97In this case, the root cause of the timer latency does not point to a 98single cause but to multiple ones. Firstly, the timer IRQ was delayed 99for 13 us, which may point to a long IRQ disabled section (see IRQ 100stacktrace section). Then the timer interrupt that wakes up the timerlat 101thread took 7597 ns, and the qxl:21 device IRQ took 7139 ns. Finally, 102the cc1 thread noise took 9909 ns of time before the context switch. 103Such pieces of evidence are useful for the developer to use other 104tracing methods to figure out how to debug and optimize the system. 105 106It is worth mentioning that the *duration* values reported 107by the osnoise: events are *net* values. For example, the 108thread_noise does not include the duration of the overhead caused 109by the IRQ execution (which indeed accounted for 12736 ns). But 110the values reported by the timerlat tracer (timerlat_latency) 111are *gross* values. 112 113The art below illustrates a CPU timeline and how the timerlat tracer 114observes it at the top and the osnoise: events at the bottom. Each "-" 115in the timelines means circa 1 us, and the time moves ==>:: 116 117 External timer irq thread 118 clock latency latency 119 event 13585 ns 39960 ns 120 | ^ ^ 121 v | | 122 |-------------| | 123 |-------------+-------------------------| 124 ^ ^ 125 ======================================================================== 126 [tmr irq] [dev irq] 127 [another thread...^ v..^ v.......][timerlat/ thread] <-- CPU timeline 128 ========================================================================= 129 |-------| |-------| 130 |--^ v-------| 131 | | | 132 | | + thread_noise: 9909 ns 133 | +-> irq_noise: 6139 ns 134 +-> irq_noise: 7597 ns 135 136IRQ stacktrace 137--------------------------- 138 139The osnoise/print_stack option is helpful for the cases in which a thread 140noise causes the major factor for the timer latency, because of preempt or 141irq disabled. For example:: 142 143 [root@f32 tracing]# echo 500 > osnoise/stop_tracing_total_us 144 [root@f32 tracing]# echo 500 > osnoise/print_stack 145 [root@f32 tracing]# echo timerlat > current_tracer 146 [root@f32 tracing]# tail -21 per_cpu/cpu7/trace 147 insmod-1026 [007] dN.h1.. 200.201948: irq_noise: local_timer:236 start 200.201939376 duration 7872 ns 148 insmod-1026 [007] d..h1.. 200.202587: #29800 context irq timer_latency 1616 ns 149 insmod-1026 [007] dN.h2.. 200.202598: irq_noise: local_timer:236 start 200.202586162 duration 11855 ns 150 insmod-1026 [007] dN.h3.. 200.202947: irq_noise: local_timer:236 start 200.202939174 duration 7318 ns 151 insmod-1026 [007] d...3.. 200.203444: thread_noise: insmod:1026 start 200.202586933 duration 838681 ns 152 timerlat/7-1001 [007] ....... 200.203445: #29800 context thread timer_latency 859978 ns 153 timerlat/7-1001 [007] ....1.. 200.203446: <stack trace> 154 => timerlat_irq 155 => __hrtimer_run_queues 156 => hrtimer_interrupt 157 => __sysvec_apic_timer_interrupt 158 => asm_call_irq_on_stack 159 => sysvec_apic_timer_interrupt 160 => asm_sysvec_apic_timer_interrupt 161 => delay_tsc 162 => dummy_load_1ms_pd_init 163 => do_one_initcall 164 => do_init_module 165 => __do_sys_finit_module 166 => do_syscall_64 167 => entry_SYSCALL_64_after_hwframe 168 169In this case, it is possible to see that the thread added the highest 170contribution to the *timer latency* and the stack trace, saved during 171the timerlat IRQ handler, points to a function named 172dummy_load_1ms_pd_init, which had the following code (on purpose):: 173 174 static int __init dummy_load_1ms_pd_init(void) 175 { 176 preempt_disable(); 177 mdelay(1); 178 preempt_enable(); 179 return 0; 180 181 } 182