1 perf-intel-pt(1)
5 ----
6 perf-intel-pt - Support for Intel Processor Trace within perf tools
9 --------
11 'perf record' -e intel_pt//
14 -----------
19 Technical details are documented in the Intel 64 and IA-32 Architectures
23 processors that are based on the Intel micro-architecture code name Broadwell.
33 Decoding is done on-the-fly. The decoder outputs samples in the same format as
43 builds, however the executed images are needed - which makes use in JIT-compiled
44 environments, or with self-modified code, a challenge. Also symbols need to be
51 vary depending on the use-case and architecture.
55 ----------
61 Data is captured with 'perf record' e.g. to trace 'ls' userspace-only:
63 perf record -e intel_pt//u ls
69 Tracing kernel space as well presents a problem, namely kernel self-modifying
73 --kcore is used, but access to /proc/kcore is restricted e.g.
75 sudo perf record -o pt_ls --kcore -e intel_pt// -- ls
82 sudo perf report -i pt_ls
84 Because samples are synthesized after-the-fact, the sampling period can be
87 sudo perf report -i pt_ls --itrace=i1usge
89 See the sections below for more information about the --itrace option.
103 perf record -e intel_pt//u ls
104 perf script --itrace=iybxwpe
109 perf script --itrace=iybxwpe -F+flags
113 in transaction, VM-entry, VM-exit, interrupt disabled, and interrupt disable
118 perf script --insn-trace --xed
124 perf script --call-trace
128 perf script --call-ret-trace
133 perf script --time starttime,stoptime --insn-trace --xed
136 the -C option
138 perf script --time starttime,stoptime --insn-trace --xed -C 1
145 perf script --itrace=be -F+ipc
147 There are two ways that instructions-per-cycle (IPC) can be calculated depending
152 MTC packets are used - refer to the 'mtc' config term. When MTC is used, however,
170 useful to use the 'A' option in conjunction with dlfilter-show-cycles.so to
179 Another note, in the case of "branches" events, non-taken branches are not
181 TNT packet that starts with a non-taken branch. To see every possible IPC
182 value, "instructions" events can be used e.g. --itrace=i0ns
186 Refer to script export-to-sqlite.py or export-to-postgresql.py for more details,
187 and to script exported-sql-viewer.py for an example of using the database.
189 There is also script intel-pt-events.py which provides an example of how to
193 - --insn-trace - instruction trace
194 - --src-trace - source trace
196 The intel-pt-events.py script also has options:
198 - --all-switch-events - display all switch events, not only the last consecutive.
199 - --interleave [<n>] - interleave sample output for the same timestamp so that
209 by inability to access the executed image, self-modified or JIT-ed code, or the
210 inability to match side-band information (such as context switches and mmaps)
219 -----------
228 -e intel_pt//
232 -e intel_pt/tsc,noretcomp=0/
236 -e intel_pt/tsc=1,noretcomp=0/
238 Note there are now new config terms - see section 'config terms' further below.
245 $ grep -H . /sys/bus/event_source/devices/intel_pt/format/*
247 /sys/bus/event_source/devices/intel_pt/format/cyc_thresh:config:19-22
249 /sys/bus/event_source/devices/intel_pt/format/mtc_period:config:14-17
251 /sys/bus/event_source/devices/intel_pt/format/psb_period:config:24-27
256 -e intel_pt/noretcomp=0/
260 -e intel_pt/tsc=1,noretcomp=0/
264 -e intel_pt/tsc=0/
268 -e intel_pt/config=0x400/
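The format files listed above give the bit range that each config term occupies in the config word. As a rough sketch only (the helper name pt_config_field is hypothetical and not part of perf), a term value can be shifted into its bit range to see the raw config value it would correspond to:

```shell
# Hypothetical helper, not part of perf: place VALUE into bits LO..HI of
# the config word and print the result in hex.
pt_config_field() {
    lo=$1 hi=$2 value=$3
    width=$(( hi - lo + 1 ))
    max=$(( (1 << width) - 1 ))
    # Reject values that do not fit in the field width.
    if [ "$value" -lt 0 ] || [ "$value" -gt "$max" ]; then
        echo "value $value does not fit in bits $lo-$hi" >&2
        return 1
    fi
    printf '0x%x\n' $(( value << lo ))
}

# mtc_period occupies config:14-17 in the listing above.
pt_config_field 14 17 3    # prints 0xc000
```

This only checks that the value fits the field width. The hardware may accept a narrower range, which perf reports separately (for example "Valid values are: 0-5" for psb_period).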
283 perf_event_attr is displayed if the -vv option is used e.g.
285 ------------------------------------------------------------
299 ------------------------------------------------------------
300 sys_perf_event_open: pid 31104 cpu 0 group_fd -1 flags 0x8
301 sys_perf_event_open: pid 31104 cpu 1 group_fd -1 flags 0x8
302 sys_perf_event_open: pid 31104 cpu 2 group_fd -1 flags 0x8
303 sys_perf_event_open: pid 31104 cpu 3 group_fd -1 flags 0x8
304 ------------------------------------------------------------
310 The June 2015 version of Intel 64 and IA-32 Architectures Software Developer
317 without timing information, for example a per-thread context
357 $ perf record -e intel_pt/psb_period=15/u uname
358 Invalid psb_period for intel_pt. Valid values are: 0-5
384 The frequency of MTC packets can also be specified - see
387 mtc_period Specifies how frequently MTC packets are produced - see mtc
399 CTC-frequency / (2 ^ value)
401 e.g. value 3 means one eighth of CTC-frequency
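The formula above can be checked with a small shell sketch. The helper name mtc_frequency and the 24000000 Hz CTC frequency are assumptions chosen for illustration; the real CTC frequency depends on the processor.

```shell
# Hypothetical helper: MTC packet frequency for a given mtc_period value,
# following the formula CTC-frequency / (2 ^ value).
mtc_frequency() {
    ctc_hz=$1 value=$2
    # Dividing by 2 ^ value is a right shift by value bits.
    echo $(( ctc_hz >> value ))
}

# Assumed example CTC frequency of 24000000 Hz (not taken from the source).
mtc_frequency 24000000 3   # prints 3000000, one eighth of the CTC frequency
```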
409 $ perf record -e intel_pt/mtc_period=15/u uname
430 a threshold - see cyc_thresh below.
432 cyc_thresh Specifies how frequently CYC packets are produced - see cyc
446 2 ^ (value - 1)
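As an illustration of that formula, the following sketch converts a cyc_thresh value into a number of CPU cycles. The helper name cyc_thresh_cycles is hypothetical, and the sketch deliberately applies the formula only to values of 1 or more.

```shell
# Hypothetical helper: number of CPU cycles that must pass for a given
# cyc_thresh value, using 2 ^ (value - 1).
cyc_thresh_cycles() {
    value=$1
    # This sketch does not interpret value 0; it only applies the formula.
    if [ "$value" -lt 1 ]; then
        echo "formula 2 ^ (value - 1) is only applied here for value >= 1" >&2
        return 1
    fi
    echo $(( 1 << (value - 1) ))
}

cyc_thresh_cycles 4   # prints 8
```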
455 $ perf record -e intel_pt/cyc,cyc_thresh=15/u uname
456 Invalid cyc_thresh for intel_pt. Valid values are: 0-12
460 pt Specifies pass-through which enables the 'branch' config term.
489 changes to the CPU C-state.
511 return compression is disabled - see noretcomp) return statements.
528 --aux-sample
532 --aux-sample=8192
536 -e intel_pt//u
539 following will create Intel PT samples on the branch-misses event, note the
542 perf record --aux-sample -e '{intel_pt//u,branch-misses:u}'
544 An alternative to '--aux-sample' is to add the config term 'aux-sample-size' to
547 perf record -e intel_pt//u -e branch-misses/aux-sample-size=8192/u
551 perf record -e '{intel_pt//u,branch-misses/aux-sample-size=8192/u}'
555 …perf record -e intel_pt//u --filter 'filter * @/bin/ls' -e branch-misses/aux-sample-size=8192/u --…
585 -S
589 -S0x100000
597 The snapshot size is displayed if the option -vv is used e.g.
605 Intel PT buffer size is specified by an addition to the -m option e.g.
607 -m,16
611 Note that the existing functionality of -m is unchanged. The auxtrace mmap size
625 In full-trace mode, powers of two are allowed for buffer size, with a minimum
629 The mmap size and auxtrace mmap size are displayed if the -vv option is used e.g.
639 full-trace mode
643 Full-trace mode traces continuously e.g.
645 perf record -e intel_pt//u uname
649 perf record --aux-sample -e intel_pt//u -e branch-misses:u
654 perf record -v -e intel_pt//u -S ./loopy 1000000000 &
656 kill -USR2 11435
660 Note that "Recording AUX area tracing snapshot" is displayed because the -v
670 $ sudo ~/bin/perf record --control fifo:perf.control,perf.ack -S -e intel_pt//u -- sleep 60 &
672 $ ps -e | grep perf
674 $ kill -USR2 15244
675 bash: kill: (15244) - Operation not permitted
698 In full-trace mode, the driver waits for data to be copied out before allowing
699 the (logical) buffer to wrap-around. If data is not copied out quickly enough,
702 that happens, perf tools always re-enable the intel_pt event after copying out
709 By default "perf record" post-processes the event stream to find all build ids
717 perf buildid-list
721 perf buildid-list --with-hits
729 collection of side-band information. In order to prevent that, a dummy
732 there is complete side-band information to allow the decoding of subsequent
755 "per thread" mode is selected by -t or by --per-thread (with -p or -u or just a
757 "per cpu" is selected by -C or -a.
761 In per-thread mode an exact list of threads is traced. There is no inheritance.
764 In per-cpu mode all processes (or processes from the selected cgroup i.e. -G
765 option, or processes selected with -p or -u) are traced. Each cpu has its own
768 In workload-only mode, the workload is traced but with per-cpu buffers.
769 Inheritance is allowed. Note that you can now trace a workload in per-thread
770 mode by using the --per-thread option.
773 Privileged vs non-privileged users
776 Unless /proc/sys/kernel/perf_event_paranoid is set to -1, unprivileged users
792 Unless /proc/sys/kernel/perf_event_paranoid is set to -1, unprivileged users are
793 not permitted to use tracepoints which means there is insufficient side-band
794 information to decode Intel PT in per-cpu mode, and potentially workload-only
797 Note also, that to use tracepoints, read-access to debugfs is required. So if
798 debugfs is not mounted or the user does not have read-access, it will again not
799 be possible to decode Intel PT in per-cpu mode.
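The two conditions above can be checked before recording. This is a minimal sketch for an unprivileged user: the function name can_use_tracepoints is hypothetical, and /sys/kernel/debug/tracing is an assumed debugfs location that may differ on a given system.

```shell
# Hypothetical check mirroring the two conditions above for an
# unprivileged user: perf_event_paranoid must be -1 and the tracing
# directory must be readable.
can_use_tracepoints() {
    paranoid=$1 tracing_dir=$2
    [ "$paranoid" -eq -1 ] && [ -r "$tracing_dir" ]
}

# Read the current setting; fall back to 2 if the file cannot be read.
paranoid_now=$(cat /proc/sys/kernel/perf_event_paranoid 2>/dev/null || echo 2)

# /sys/kernel/debug/tracing is an assumed mount point.
if can_use_tracepoints "$paranoid_now" /sys/kernel/debug/tracing; then
    echo "tracepoints available"
else
    echo "tracepoints unavailable for unprivileged users"
fi
```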
805 The sched_switch tracepoint is used to provide side-band data for Intel PT
812 $ perf record -vv -e intel_pt//u uname
813 ------------------------------------------------------------
827 ------------------------------------------------------------
828 sys_perf_event_open: pid 31104 cpu 0 group_fd -1 flags 0x8
829 sys_perf_event_open: pid 31104 cpu 1 group_fd -1 flags 0x8
830 sys_perf_event_open: pid 31104 cpu 2 group_fd -1 flags 0x8
831 sys_perf_event_open: pid 31104 cpu 3 group_fd -1 flags 0x8
832 ------------------------------------------------------------
843 ------------------------------------------------------------
844 sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8
845 sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8
846 sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8
847 sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8
848 ------------------------------------------------------------
867 ------------------------------------------------------------
868 sys_perf_event_open: pid 31104 cpu 0 group_fd -1 flags 0x8
869 sys_perf_event_open: pid 31104 cpu 1 group_fd -1 flags 0x8
870 sys_perf_event_open: pid 31104 cpu 2 group_fd -1 flags 0x8
871 sys_perf_event_open: pid 31104 cpu 3 group_fd -1 flags 0x8
881 and only in per-cpu mode.
889 -----------
892 This can be further controlled by new option --itrace.
895 New --itrace option
900 --itrace
904 --itrace=cepwxy
916 o synthesize PEBS-via-PT events
927 Z prefer to ignore timestamps (so-called "timeless" decoding)
929 "Instructions" events look like they were recorded by "perf record -e
932 "Cycles" events look like they were recorded by "perf record -e cycles"
942 "Branches" events look like they were recorded by "perf record -e branches". "c"
960 "Power" events correspond to power event packets and CBR (core-to-bus ratio)
964 C-state changes, whereas CBR is indicative of CPU frequency. perf script
969 pwre: hw: 0 cstate: 2 sub-cstate: 0
975 "cbr" includes the frequency and the percentage of maximum non-turbo
977 "pwre" shows C-state transitions (to a C-state deeper than C0) and
983 For more details refer to the Intel 64 and IA-32 Architectures Software
986 PSB events show when a PSB+ occurred and also the byte-offset in the trace.
987 Emitting a PSB+ can cause a slight delay on the CPU. When doing timing analysis
994 will or will not be reported. Each flag must be preceded by either '+' or '-'.
997 -o Suppress overflow errors
998 -l Suppress trace data lost errors
1002 --itrace=e-o-l
1008 must be preceded by either '+' or '-'. The flags supported by Intel PT are:
1010 -a Suppress logging of perf events
1018 linkperf:perf-config[1] e.g. perf config itrace.debug-log-buffer-size=30000
1022 --itrace=i10us
1040 'instructions' (i.e. --itrace=i1i).
1045 --itrace=ig32
1046 --itrace=xg32
1051 --itrace=il10
1052 --itrace=xl10
1059 instead of synthesized events. For example, to record branch-misses events for
1062 perf record --aux-sample -e '{intel_pt//u,branch-misses:u}' -- ls
1063 perf report --itrace=Ge
1075 - hardware supports it
1076 - PEBS is used
1077 - event period is specified, instead of frequency
1078 - the sample type is limited to the following flags:
1087 cases, avoid specifying the event period i.e. avoid the 'perf record' -c option,
1088 --count option, or 'period' config term.
1090 To disable trace decoding entirely, use the option --no-itrace.
1095 --itrace=i0nss1000000
1106 ranges that could then be decoded fully using the --time option.
1110 - direct calls and jmps
1111 - conditional branches
1112 - non-branch instructions
1116 - asynchronous branches such as interrupts
1117 - indirect branches
1118 - function return target address *if* the noretcomp config term (refer
1120 - start of (control-flow) tracing
1121 - end of (control-flow) tracing, if it is not out of context
1122 - power events, ptwrite, transaction start and abort
1123 - instruction pointer associated with PSB packets
1128 Repeating the q option (double-q i.e. qq) results in even faster decoding and even
1137 - everything except instruction pointer associated with PSB packets
1141 - instruction pointer associated with PSB packets
1148 dlfilter-show-cycles.so
1151 Cycles can be displayed using dlfilter-show-cycles.so in which case the itrace A
1154 perf script --itrace=A --call-trace --dlfilter dlfilter-show-cycles.so
1158 perf script -v --list-dlfilters
1160 See also linkperf:perf-dlfilters[1]
1166 perf script has an option (-D) to "dump" the events i.e. display the binary
1169 When -D is used, Intel PT packets are displayed. The packet decoder does not
1170 pay attention to PSB packets, but just decodes the bytes - so the packets seen
1172 One example of that would be when the buffer-switching interrupt has been too
1177 To disable the display of Intel PT packets, combine the -D option with
1178 --no-itrace.
1182 -----------
1185 This can be further controlled by new option --itrace exactly the same as
1186 perf script, with the exception that the default is --itrace=igxe.
1190 -----------
1192 perf inject also accepts the --itrace option in which case tracing data is
1195 perf inject --itrace -i perf.data -o perf.data.new
1202 $ gcc-5 -O3 sort.c -o sort_optimized
1208 [intel-pt]
1209 mispred-all = on
1211 $ perf record -e intel_pt//u ./sort 3000
1216 $ perf inject -i perf.data -o inj --itrace=i100usle --strip
1217 $ ./create_gcov --binary=./sort --profile=inj --gcov=sort.gcov -gcov_version=1
1218 $ gcc-5 -O3 -fauto-profile=sort.gcov sort.c -o sort_autofdo
1228 -----------------
1231 Recording is selected by using the aux-output config term e.g.
1233 perf record -c 10000 -e '{intel_pt/branch=0/,cycles/aux-output/ppp}' uname
1237 kernels and perf tools add support for the PERF_RECORD_AUX_OUTPUT_HW_ID side-band event.
1238 To check for the presence of that event in a PEBS-via-PT trace:
1240 perf script -D --no-itrace | grep PERF_RECORD_AUX_OUTPUT_HW_ID
1244 perf script --itrace=oe
1247 ---
1249 include::build-xed.txt[]
1253 --------------------------------------
1256 (i.e. no TSC timestamps) or VM Time Correlation. VM Time Correlation is an extra step
1263 …Guest kernel self-modifying code (e.g. jump labels or JIT-compiled eBPF) will result in decoding e…
1275 Mount the guest file system. Note sshfs needs -o direct_io to enable reading of proc files. root …
1278 $ sshfs -o direct_io root@vm0:/ vm0
1282 $ perf buildid-cache -v --kcore vm0/proc/kcore
1283 …kcore added to build-id cache directory /home/user/.debug/[kernel.kcore]/9600f316a53a0f54278885e8d…
1288 $ ps -eLl | grep 'KVM\|PID'
1290 3 S 64055 1430 1 1440 1 80 0 - 1921718 - ? 00:02:47 CPU 0/KVM
1291 3 S 64055 1430 1 1441 1 80 0 - 1921718 - ? 00:02:41 CPU 1/KVM
1292 3 S 64055 1430 1 1442 1 80 0 - 1921718 - ? 00:02:38 CPU 2/KVM
1293 3 S 64055 1430 1 1443 2 80 0 - 1921718 - ? 00:03:18 CPU 3/KVM
1295 Start an open-ended perf record, tracing the VM process, do something on the VM, and then ctrl-C to…
1299 Intel PT traces both the host and the guest so --guest and --host need to be specified.
1300 Without timestamps, --per-thread must be specified to distinguish threads.
1302 …$ sudo perf kvm --guest --host --guestkallsyms $KALLSYMS record --kcore -e intel_pt/tsc=0,mtc=0,cy…
1309 $ perf script --guestkallsyms $KALLSYMS --insn-trace --xed -F+ipc | grep -C10 vmresume | head -21
1339 Mount the guest file system. Note sshfs needs -o direct_io to enable reading of proc files. root …
1341 $ mkdir -p vm0
1342 $ sshfs -o direct_io root@vm0:/ vm0
1346 $ perf buildid-cache -v --kcore vm0/proc/kcore
1352 $ ps -eLl | grep 'KVM\|PID'
1354 3 S 64055 16998 1 17005 13 80 0 - 1818189 - ? 00:00:16 CPU 0/KVM
1355 3 S 64055 16998 1 17006 4 80 0 - 1818189 - ? 00:00:05 CPU 1/KVM
1356 3 S 64055 16998 1 17007 3 80 0 - 1818189 - ? 00:00:04 CPU 2/KVM
1357 3 S 64055 16998 1 17008 4 80 0 - 1818189 - ? 00:00:05 CPU 3/KVM
1359 Start an open-ended perf record, tracing the VM process, do something on the VM, and then ctrl-C to…
1362 Intel PT traces both the host and the guest so --guest and --host need to be specified.
1364 …$ sudo perf kvm --guest --host --guestkallsyms $KALLSYMS record --kcore -e intel_pt/cyc=1/k -p 169…
1369 only 7 bytes, so the TSC Offset might differ from the actual value in the 8th byte. That will
1372 $ perf inject -i perf.data.kvm --vm-time-correlation=dry-run
1388 $ perf inject -i perf.data.kvm --vm-time-correlation="dry-run 0xffffe42722c64c41"
1390 Note the options for 'perf inject' --vm-time-correlation are:
1392 [ dry-run ] [ <TSC Offset> [ : <VMCS> [ , <VMCS> ]... ] ]...
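A small sketch of how such an argument could be assembled from its parts. The helper name vmtc_arg is hypothetical, it handles only a single TSC Offset group, and joining the parts without spaces around ':' and ',' is an assumption about what the option accepts.

```shell
# Hypothetical helper: build the argument for --vm-time-correlation.
# Usage: vmtc_arg [dry-run] [<tsc-offset> [<vmcs> ...]]
vmtc_arg() {
    arg=""
    if [ "$1" = "dry-run" ]; then
        arg="dry-run"
        shift
    fi
    if [ $# -gt 0 ]; then
        # Separate "dry-run" from the offset with a space, if present.
        arg="${arg:+$arg }$1"
        shift
        sep=":"
        for vmcs in "$@"; do
            arg="$arg$sep$vmcs"
            sep=","
        done
    fi
    echo "$arg"
}

vmtc_arg dry-run 0xffffe42722c64c41
# prints: dry-run 0xffffe42722c64c41
```

The result would be passed quoted, e.g. --vm-time-correlation="$(vmtc_arg dry-run 0xffffe42722c64c41)".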
1395 The option "dry-run" will cause the file to be processed but without updating it.
1396 Note it is also possible to get an intel_pt.log file by adding option --itrace=d
1400 $ perf inject -i perf.data.kvm --vm-time-correlation=0xffffe42722c64c41 --force
1404 $ perf script -i perf.data.kvm --guestkallsyms $KALLSYMS --itrace=e-o
1410 …$ perf script -i perf.data.kvm --guestkallsyms $KALLSYMS --insn-trace --xed -F+ipc | grep -C10 vmr…
1435 -----------------------------------------------
1444 Check that no-kvmclock kernel command line option was used to boot:
1449 …BOOT_IMAGE=/boot/vmlinuz-5.10.0-16-amd64 root=UUID=cb49c910-e573-47e0-bce7-79e293df8e1d ro no-kvmc…
1458 …$ sudo perf record -o guest-sideband-testing-guest-perf.data --sample-identifier --buildid-all --s…
1466 $ sudo perf record -o guest-sideband-testing-host-perf.data -m,64M --kcore -a -e intel_pt/cyc/
1481 [ perf record: Captured and wrote 76.122 MB guest-sideband-testing-host-perf.data ]
1489 [ perf record: Captured and wrote 1.247 MB guest-sideband-testing-guest-perf.data ]
1491 And then copy guest-sideband-testing-guest-perf.data to the host (not shown here).
1499 $ perf inject -i guest-sideband-testing-host-perf.data --vm-time-correlation=dry-run
1507 …$ perf inject -i guest-sideband-testing-host-perf.data --vm-time-correlation=0xfffffa6ae070cb20 --…
1511 $ perf script -i guest-sideband-testing-host-perf.data --no-itrace --show-task-events | grep KVM
1517 Note, the QEMU option -name debug-threads=on is needed so that thread names
1522 $ mkdir -p ~/guestmount/13376
1523 $ sshfs -o direct_io vm_to_test:/ ~/guestmount/13376
1528 If needed, VDSO can be copied manually in a fashion similar to that used by the perf-archive script.
1530 …$ perf inject -i guest-sideband-testing-host-perf.data -o inj --guestmount ~/guestmount --guest-da…
1536 - the CPU displayed, [002] in this case, is always the host CPU
1537 …- events happening in the virtual machine start with VM:13376 VCPU:003, which shows the hypervisor…
1538 - only calls and errors are displayed i.e. --itrace=ce
1539 …- branches entering and exiting the virtual machine are split, and show as 2 branches to/from "0 […
1541 …$ perf script -i inj --itrace=ce -F+machine_pid,+vcpu,+addr,+pid,+tid,-period --ns --time 7919.408…
1546 …nown] ([unknown]) => 7f851c9b5a5c init_cacheinfo+0x3ac (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
1547 … branches: 7f851c9b5a5a init_cacheinfo+0x3aa (/usr/lib/x86_64-linux-gnu/libc-2.31.so) => …
1597 …nown] ([unknown]) => 7f851c9b5a5c init_cacheinfo+0x3ac (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
1598 …dl_init+0x74 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) => 7f851cb7bf50 call_init.part.0+0x0 (/usr…
1607 Tracing Virtual Machines - Guest Code
1608 -------------------------------------
1615 addresses. To support that, option "--guest-code" has been added to perf script
1620 …# perf record --kcore -e intel_pt/cyc/ -- tools/testing/selftests/kselftest_install/kvm/tsc_msrs_t…
1623 # perf script --guest-code --itrace=bep --ns -F-period,+addr,+flags
1653 # perf kvm --guest-code --guest --host report -i perf.data --stdio | head -20
1655 # To display the perf.data header info, please use --header/--header-only options.
1668 ---entry_SYSCALL_64_after_hwframe
1671 |--29.44%--syscall_exit_to_user_mode
1678 -----------
1691 7 VMENTRY VM-Entry
1692 8 VMEXIT VM-Exit
1693 9 VMEXIT_INTR VM-Exit due to interrupt
1696 For more details, refer to the Intel 64 and IA-32 Architectures Software
1704 perf record -e intel_pt/event/u uname
1706 Event trace events are output using the --itrace I option. e.g.
1708 perf script --itrace=Ie
1721 iflag: t IFLAG: 1->0 via branch
1731 t interrupts become disabled IF=1 -> IF=0
1733 Dt interrupts become enabled IF=0 -> IF=1
1735 The intel-pt-events.py script illustrates how to access Event Trace information
1740 -----------
1744 perf record -e intel_pt/notnt/u uname
1746 In that case the --itrace q option is forced because walking executable code
1751 ----------------
1827 $ gcc -Wall -Wextra -O3 -g -o eg_ptw eg_ptw.c
1828 $ perf record -e intel_pt//u ./eg_ptw 0x1234567890abcdef
1831 $ perf script --itrace=ew
1837 ---------
1849 - full-trace, system wide : when buffer passes watermark
1850 - full-trace, not system-wide : when buffer passes watermark or
1852 - snapshot mode : as above but also when a snapshot is made
1853 - sample mode : as above but also when a sample is made
1855 That means finished-round ordering doesn't work. An auxtrace buffer
1867 -------
1875 --------
1877 linkperf:perf-record[1], linkperf:perf-script[1], linkperf:perf-report[1],
1878 linkperf:perf-inject[1]