1# SPDX-License-Identifier: GPL-2.0+ 2# 3# Copyright (c) 2013 The Chromium OS Authors. 4 5Tracing in U-Boot 6================= 7 8U-Boot supports a simple tracing feature which allows a record of excecution 9to be collected and sent to a host machine for analysis. At present the 10main use for this is to profile boot time. 11 12 13Overview 14-------- 15 16The trace feature uses GCC's instrument-functions feature to trace all 17function entry/exit points. These are then recorded in a memory buffer. 18The memory buffer can be saved to the host over a network link using 19tftpput or by writing to an attached memory device such as MMC. 20 21On the host, the file is first converted with a tool called 'proftool', 22which extracts useful information from it. The resulting trace output 23resembles that emitted by Linux's ftrace feature, so can be visually 24displayed by pytimechart. 25 26 27Quick-start using Sandbox 28------------------------- 29 30Sandbox is a build of U-Boot that can run under Linux so it is a convenient 31way of trying out tracing before you use it on your actual board. To do 32this, follow these steps: 33 34Add the following to include/configs/sandbox.h (if not already there) 35 36#define CONFIG_TRACE 37#define CONFIG_CMD_TRACE 38#define CONFIG_TRACE_BUFFER_SIZE (16 << 20) 39#define CONFIG_TRACE_EARLY_SIZE (8 << 20) 40#define CONFIG_TRACE_EARLY 41#define CONFIG_TRACE_EARLY_ADDR 0x00100000 42 43Build sandbox U-Boot with tracing enabled: 44 45$ make FTRACE=1 O=sandbox sandbox_config 46$ make FTRACE=1 O=sandbox 47 48Run sandbox, wait for a bit of trace information to appear, and then capture 49a trace: 50 51$ ./sandbox/u-boot 52 53 54U-Boot 2013.04-rc2-00100-ga72fcef (Apr 17 2013 - 19:25:24) 55 56DRAM: 128 MiB 57trace: enabled 58Using default environment 59 60In: serial 61Out: serial 62Err: serial 63=>trace stats 64 671,406 function sites 65 69,712 function calls 66 0 untracked function calls 67 73,373 traced function calls 68 16 maximum observed call depth 69 15 call depth limit 70 66,491 calls not traced due to depth 71=>trace stats 72 671,406 function sites 73 1,279,450 function calls 74 0 untracked function calls 75 950,490 traced function calls (333217 dropped due to overflow) 76 16 maximum observed call depth 77 15 call depth limit 78 1,275,767 calls not traced due to depth 79=>trace calls 0 e00000 80Call list dumped to 00000000, size 0xae0a40 81=>print 82baudrate=115200 83profbase=0 84profoffset=ae0a40 85profsize=e00000 86stderr=serial 87stdin=serial 88stdout=serial 89 90Environment size: 117/8188 bytes 91=>host save host 0 trace 0 ${profoffset} 9211405888 bytes written in 10 ms (1.1 GiB/s) 93=>reset 94 95 96Then run proftool to convert the trace information to ftrace format. 97 98$ ./sandbox/tools/proftool -m sandbox/System.map -p trace dump-ftrace >trace.txt 99 100Finally run pytimechart to display it: 101 102$ pytimechart trace.txt 103 104Using this tool you can zoom and pan across the trace, with the function 105calls on the left and little marks representing the start and end of each 106function. 107 108 109CONFIG Options 110-------------- 111 112- CONFIG_TRACE 113 Enables the trace feature in U-Boot. 114 115- CONFIG_CMD_TRACE 116 Enables the trace command. 117 118- CONFIG_TRACE_BUFFER_SIZE 119 Size of trace buffer to allocate for U-Boot. This buffer is 120 used after relocation, as a place to put function tracing 121 information. The address of the buffer is determined by 122 the relocation code. 123 124- CONFIG_TRACE_EARLY 125 Define this to start tracing early, before relocation. 126 127- CONFIG_TRACE_EARLY_SIZE 128 Size of 'early' trace buffer. Before U-Boot has relocated 129 it doesn't have a proper trace buffer. On many boards 130 you can define an area of memory to use for the trace 131 buffer until the 'real' trace buffer is available after 132 relocation. The contents of this buffer are then copied to 133 the real buffer. 134 135- CONFIG_TRACE_EARLY_ADDR 136 Address of early trace buffer 137 138 139Building U-Boot with Tracing Enabled 140------------------------------------ 141 142Pass 'FTRACE=1' to the U-Boot Makefile to actually instrument the code. 143This is kept as a separate option so that it is easy to enable/disable 144instrumenting from the command line instead of having to change board 145config files. 146 147 148Collecting Trace Data 149--------------------- 150 151When you run U-Boot on your board it will collect trace data up to the 152limit of the trace buffer size you have specified. Once that is exhausted 153no more data will be collected. 154 155Collecting trace data has an affect on execution time/performance. You 156will notice this particularly with trvial functions - the overhead of 157recording their execution may even exceed their normal execution time. 158In practice this doesn't matter much so long as you are aware of the 159effect. Once you have done your optimisations, turn off tracing before 160doing end-to-end timing. 161 162The best time to start tracing is right at the beginning of U-Boot. The 163best time to stop tracing is right at the end. In practice it is hard 164to achieve these ideals. 165 166This implementation enables tracing early in board_init_f(). This means 167that it captures most of the board init process, missing only the 168early architecture-specific init. However, it also misses the entire 169SPL stage if there is one. 170 171U-Boot typically ends with a 'bootm' command which loads and runs an 172OS. There is useful trace data in the execution of that bootm 173command. Therefore this implementation provides a way to collect trace 174data after bootm has finished processing, but just before it jumps to 175the OS. In practical terms, U-Boot runs the 'fakegocmd' environment 176variable at this point. This variable should have a short script which 177collects the trace data and writes it somewhere. 178 179Trace data collection relies on a microsecond timer, accesed through 180timer_get_us(). So the first think you should do is make sure that 181this produces sensible results for your board. Suitable sources for 182this timer include high resolution timers, PWMs or profile timers if 183available. Most modern SOCs have a suitable timer for this. Make sure 184that you mark this timer (and anything it calls) with 185__attribute__((no_instrument_function)) so that the trace library can 186use it without causing an infinite loop. 187 188 189Commands 190-------- 191 192The trace command has variable sub-commands: 193 194- stats 195 Display tracing statistics 196 197- pause 198 Pause tracing 199 200- resume 201 Resume tracing 202 203- funclist [<addr> <size>] 204 Dump a list of functions into the buffer 205 206- calls [<addr> <size>] 207 Dump function call trace into buffer 208 209If the address and size are not given, these are obtained from environment 210variables (see below). In any case the environment variables are updated 211after the command runs. 212 213 214Environment Variables 215--------------------- 216 217The following are used: 218 219- profbase 220 Base address of trace output buffer 221 222- profoffset 223 Offset of first unwritten byte in trace output buffer 224 225- profsize 226 Size of trace output buffer 227 228All of these are set by the 'trace calls' command. 229 230These variables keep track of the amount of data written to the trace 231output buffer by the 'trace' command. The trace commands which write data 232to the output buffer can use these to specify the buffer to write to, and 233update profoffset each time. This allows successive commands to append data 234to the same buffer, for example: 235 236 trace funclist 10000 e00000 237 trace calls 238 239(the latter command appends more data to the buffer). 240 241 242- fakegocmd 243 Specifies commands to run just before booting the OS. This 244 is a useful time to write the trace data to the host for 245 processing. 246 247 248Writing Out Trace Data 249---------------------- 250 251Once the trace data is in an output buffer in memory there are various ways 252to transmit it to the host. Notably you can use tftput to send the data 253over a network link: 254 255fakegocmd=trace pause; usb start; set autoload n; bootp; 256 trace calls 10000000 1000000; 257 tftpput ${profbase} ${profoffset} 192.168.1.4:/tftpboot/calls 258 259This starts up USB (to talk to an attached USB Ethernet dongle), writes 260a trace log to address 10000000 and sends it to a host machine using 261TFTP. After this, U-Boot will boot the OS normally, albeit a little 262later. 263 264 265Converting Trace Output Data 266---------------------------- 267 268The trace output data is kept in a binary format which is not documented 269here. To convert it into something useful, you can use proftool. 270 271This tool must be given the U-Boot map file and the trace data received 272from running that U-Boot. It produces a text output file. 273 274Options 275 -m <map_file> 276 Specify U-Boot map file 277 278 -p <trace_file> 279 Specifiy profile/trace file 280 281Commands: 282 283- dump-ftrace 284 Write a text dump of the file in Linux ftrace format to stdout 285 286 287Viewing the Trace Data 288---------------------- 289 290You can use pytimechart for this (sudo apt-get pytimechart might work on 291your Debian-style machine, and use your favourite search engine to obtain 292documentation). It expects the file to have a .txt extension. The program 293has terse user interface but is very convenient for viewing U-Boot 294profile information. 295 296 297Workflow Suggestions 298-------------------- 299 300The following suggestions may be helpful if you are trying to reduce boot 301time: 302 3031. Enable CONFIG_BOOTSTAGE and CONFIG_BOOTSTAGE_REPORT. This should get 304you are helpful overall snapshot of the boot time. 305 3062. Build U-Boot with tracing and run it. Note the difference in boot time 307(it is common for tracing to add 10% to the time) 308 3093. Collect the trace information as descibed above. Use this to find where 310all the time is being spent. 311 3124. Take a look at that code and see if you can optimise it. Perhaps it is 313possible to speed up the initialisation of a device, or remove an unused 314feature. 315 3165. Rebuild, run and collect again. Compare your results. 317 3186. Keep going until you run out of steam, or your boot is fast enough. 319 320 321Configuring Trace 322----------------- 323 324There are a few parameters in the code that you may want to consider. 325There is a function call depth limit (set to 15 by default). When the 326stack depth goes above this then no tracing information is recorded. 327The maximum depth reached is recorded and displayed by the 'trace stats' 328command. 329 330 331Future Work 332----------- 333 334Tracing could be a little tidier in some areas, for example providing 335run-time configuration options for trace. 336 337Some other features that might be useful: 338 339- Trace filter to select which functions are recorded 340- Sample-based profiling using a timer interrupt 341- Better control over trace depth 342- Compression of trace information 343 344 345Simon Glass <sjg@chromium.org> 346April 2013 347