perf-list(1)
============

NAME
----
perf-list - List all symbolic event types

SYNOPSIS
--------
[verse]
'perf list' [--no-desc] [--long-desc] [hw|sw|cache|tracepoint|pmu|sdt|event_glob]

DESCRIPTION
-----------
This command displays the symbolic event types which can be selected in the
various perf commands with the -e option.

OPTIONS
-------
--no-desc::
Don't print descriptions.

-v::
--long-desc::
Print longer event descriptions.

--details::
Print how named events are resolved internally into perf events, and also
any extra expressions computed by perf stat.


[[EVENT_MODIFIERS]]
EVENT MODIFIERS
---------------

An event can optionally be qualified by appending a colon and one or
more modifiers. Modifiers allow the user to restrict the events to be
counted. The following modifiers exist:

 u - user-space counting
 k - kernel counting
 h - hypervisor counting
 I - non idle counting
 G - guest counting (in KVM guests)
 H - host counting (not in KVM guests)
 p - precise level
 P - use maximum detected precise level
 S - read sample value (PERF_SAMPLE_READ)
 D - pin the event to the PMU

The 'p' modifier can be used for specifying how precise the instruction
address should be. The 'p' modifier can be specified multiple times; the
number of 'p's gives the precise level:

 0 - SAMPLE_IP can have arbitrary skid
 1 - SAMPLE_IP must have constant skid
 2 - SAMPLE_IP requested to have 0 skid
 3 - SAMPLE_IP must have 0 skid, or uses randomization to avoid
     sample shadowing effects.

On Intel systems, precise event sampling is implemented with PEBS,
which supports up to precise level 2, and precise level 3 in
some special cases.

On AMD systems it is implemented using IBS (up to precise level 2).
The precise modifier works with event types 0x76 (cpu-cycles, CPU
clocks not halted) and 0xC1 (micro-ops retired).
Both events map to
IBS execution sampling (IBS op) with the IBS Op Counter Control bit
(IbsOpCntCtl) set accordingly (see AMD64 Architecture Programmer’s
Manual Volume 2: System Programming, 13.3 Instruction-Based
Sampling). Examples using IBS:

 perf record -a -e cpu-cycles:p ...    # use ibs op counting cycles
 perf record -a -e r076:p ...          # same as -e cpu-cycles:p
 perf record -a -e r0C1:p ...          # use ibs op counting micro-ops

RAW HARDWARE EVENT DESCRIPTOR
-----------------------------
Even when an event is not available in a symbolic form within perf right now,
it can be encoded in a processor specific way.

For instance, on x86 CPUs NNN (as in an event written rNNN) represents the
raw register encoding with the layout of IA32_PERFEVTSELx MSRs (see
[Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3B: System Programming Guide] Figure 30-1 Layout
of IA32_PERFEVTSELx MSRs) or AMD's PerfEvtSeln (see [AMD64 Architecture Programmer’s Manual Volume 2: System Programming], Page 344,
Figure 13-7 Performance Event-Select Register (PerfEvtSeln)).

Note: Only the following bit fields can be set in x86 counter
registers: event, umask, edge, inv, cmask. In particular, guest/host only
and OS/user mode flags must be set up using <<EVENT_MODIFIERS, EVENT
MODIFIERS>>.

Example:

If the Intel docs for a QM720 Core i7 describe an event as:

  Event  Umask  Event Mask
  Num.   Value  Mnemonic    Description                        Comment

  A8H      01H  LSD.UOPS    Counts the number of micro-ops     Use cmask=1 and
                            delivered by loop stream detector  invert to count
                                                               cycles

the raw encoding 0x1A8 (umask 0x01 followed by event number 0xA8) can be used:

 perf stat -e r1a8 -a sleep 1
 perf record -e r1a8 ...

You should refer to the processor specific documentation for getting these
details. Some of them are referenced in the SEE ALSO section below.

ARBITRARY PMUS
--------------

perf also supports an extended syntax for specifying raw parameters
to PMUs.
Using this typically requires looking up the specific event
in the CPU vendor specific documentation.

The available PMUs and their raw parameters can be listed with:

  ls /sys/devices/*/format

For example, the raw "LSD.UOPS" core PMU event above could
be specified as:

  perf stat -e cpu/event=0xa8,umask=0x1,name=LSD.UOPS_CYCLES,cmask=1/ ...

PER SOCKET PMUS
---------------

Some PMUs are not associated with a core, but with a whole CPU socket.
Events on these PMUs generally cannot be sampled, but only counted globally
with perf stat -a. They can be bound to one logical CPU, but will measure
all the CPUs in the same socket.

This example measures memory bandwidth every second
on the first memory controller on socket 0 of an Intel Xeon system:

  perf stat -C 0 -a -e uncore_imc_0/cas_count_read/,uncore_imc_0/cas_count_write/ -I 1000 ...

Each memory controller has its own PMU. Measuring the complete system
bandwidth would require specifying all imc PMUs (see perf list output),
and adding the values together.

This example measures the combined core power every second:

  perf stat -I 1000 -e power/energy-cores/ -a

ACCESS RESTRICTIONS
-------------------

For non-root users, generally only context-switched PMU events are available.
These are normally only the events in the cpu PMU, the predefined events
like cycles and instructions, and some software events.

Other PMUs and global measurements are normally root only.
Some event qualifiers, such as "any", are also root only.

This can be overridden by setting the kernel.perf_event_paranoid
sysctl to -1, which allows non-root users to use these events.

To access tracepoint events, perf needs read access to
/sys/kernel/debug/tracing, even when perf_event_paranoid is in a relaxed
setting.
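
As a minimal sketch, the current setting can be inspected from a shell before
running perf as a non-root user. The helper name paranoid_allows_all is
hypothetical, and the check assumes only what is stated above, namely that -1
lifts the restrictions; any other value is simply reported as restricted.

```shell
# Hypothetical helper: succeed only for the value documented above (-1)
# as allowing non-root users to use all events.
paranoid_allows_all() {
    [ "$1" = "-1" ]
}

# Read the live value when the sysctl file is present; otherwise do nothing.
paranoid_file=/proc/sys/kernel/perf_event_paranoid
if [ -r "$paranoid_file" ]; then
    level=$(cat "$paranoid_file")
    if paranoid_allows_all "$level"; then
        echo "perf_event_paranoid=$level: non-root users may use all events"
    else
        echo "perf_event_paranoid=$level: some events are restricted to root"
    fi
fi
```

The same value can be changed as root through the kernel.perf_event_paranoid
sysctl named above.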

TRACING
-------

Some PMUs control advanced hardware tracing capabilities, such as Intel PT,
that allow low overhead execution tracing. These are described in a separate
intel-pt.txt document.

PARAMETERIZED EVENTS
--------------------

Some PMU events listed by 'perf list' will be displayed with '?' in them. For
example:

  hv_gpci/dtbp_ptitc,phys_processor_idx=?/

This means that when provided as an event, a value for '?' must
also be supplied. For example:

  perf stat -C 0 -e 'hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/' ...

EVENT GROUPS
------------

Perf supports time based multiplexing of events, when the number of events
active exceeds the number of hardware performance counters. Multiplexing
can cause measurement errors when the workload changes its execution
profile.

When metrics are computed using formulas from event counts, it is useful to
ensure some events are always measured together as a group to minimize
multiplexing errors. Event groups can be specified using { }.

  perf stat -e '{instructions,cycles}' ...

The number of available performance counters depends on the CPU. A group
cannot contain more events than available counters.
For example, Intel Core CPUs typically have four generic performance counters
for the core, plus three fixed counters for instructions, cycles and
ref-cycles. Some special events have restrictions on which counter they
can schedule, and may not support multiple instances in a single group.
When too many events are specified in the group, none of them will
be measured.

Globally pinned events can limit the number of counters available for
other groups. On x86 systems, the NMI watchdog pins a counter by default.
The NMI watchdog can be disabled as root with:

  echo 0 > /proc/sys/kernel/nmi_watchdog

Events from multiple different PMUs cannot be mixed in a group, with
some exceptions for software events.

LEADER SAMPLING
---------------

perf also supports group leader sampling using the :S specifier.

  perf record -e '{cycles,instructions}:S' ...
  perf report --group

Normally all events in an event group sample, but with :S only
the first event (the leader) samples, and it only reads the values of the
other events in the group.

OPTIONS
-------

Without options all known events will be listed.

To limit the list use:

. 'hw' or 'hardware' to list hardware events such as cache-misses, etc.

. 'sw' or 'software' to list software events such as context switches, etc.

. 'cache' or 'hwcache' to list hardware cache events such as L1-dcache-loads, etc.

. 'tracepoint' to list all tracepoint events, alternatively use
  'subsys_glob:event_glob' to filter by tracepoint subsystems such as sched,
  block, etc.

. 'pmu' to print the kernel supplied PMU events.

. 'sdt' to list all Statically Defined Tracepoint events.

. If none of the above is matched, it will apply the supplied glob to all
  events, printing the ones that match.

. As a last resort, it will do a substring search in all event names.

One or more types can be used at the same time, listing the events for the
types specified.

Raw format is also supported:

. '--raw-dump', shows the raw-dump of all the events.
. '--raw-dump [hw|sw|cache|tracepoint|pmu|event_glob]', shows the raw-dump of
  a certain kind of events.
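
The fallback order described above (glob match first, substring search as a
last resort) can be illustrated with a small shell sketch. This is not perf's
implementation: the EVENTS sample and the match_events helper are hypothetical,
and type keywords such as 'hw' are not handled.

```shell
# Hypothetical sample of event names; real names come from 'perf list'.
EVENTS="cpu-cycles cache-misses context-switches L1-dcache-loads sched:sched_switch"

# Print the sample events selected by one filter argument.
# A glob match is tried first; a substring match is used only
# when the glob selected nothing.
match_events() {
    pattern=$1
    found=""
    for ev in $EVENTS; do
        case $ev in
            $pattern) found="$found $ev" ;;   # unquoted: treated as a glob
        esac
    done
    if [ -z "$found" ]; then
        for ev in $EVENTS; do
            case $ev in
                *"$pattern"*) found="$found $ev" ;;   # quoted: literal substring
            esac
        done
    fi
    echo $found
}

match_events 'cache-*'
match_events 'dcache'
```

With the sample list, the first call selects an event through the glob, while
the second matches nothing as a glob and falls back to the substring search.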

SEE ALSO
--------
linkperf:perf-stat[1], linkperf:perf-top[1],
linkperf:perf-record[1],
http://www.intel.com/sdm/[Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3B: System Programming Guide],
http://support.amd.com/us/Processor_TechDocs/24593_APM_v2.pdf[AMD64 Architecture Programmer’s Manual Volume 2: System Programming]