perf-list(1)
============

NAME
----
perf-list - List all symbolic event types

SYNOPSIS
--------
[verse]
'perf list' [--no-desc] [--long-desc]
            [hw|sw|cache|tracepoint|pmu|sdt|metric|metricgroup|event_glob]

DESCRIPTION
-----------
This command displays the symbolic event types which can be selected in the
various perf commands with the -e option.

OPTIONS
-------
-d::
--desc::
Print extra event descriptions. (default)

--no-desc::
Don't print descriptions.

-v::
--long-desc::
Print longer event descriptions.

--debug::
Enable debugging output.

--details::
Print how named events are resolved internally into perf events, and also
any extra expressions computed by perf stat.

--deprecated::
Print deprecated events. By default the deprecated events are hidden.

--unit::
Print PMU events and metrics limited to the specific PMU name.
(e.g. --unit cpu, --unit msr, --unit cpu_core, --unit cpu_atom)

-j::
--json::
Output in JSON format.

[[EVENT_MODIFIERS]]
EVENT MODIFIERS
---------------

Events can optionally have a modifier by appending a colon and one or
more modifiers. Modifiers allow the user to restrict the events to be
counted. The following modifiers exist:

 u - user-space counting
 k - kernel counting
 h - hypervisor counting
 I - non idle counting
 G - guest counting (in KVM guests)
 H - host counting (not in KVM guests)
 p - precise level
 P - use maximum detected precise level
 S - read sample value (PERF_SAMPLE_READ)
 D - pin the event to the PMU
 W - group is weak and will fall back to non-group if not schedulable
 e - group or event are exclusive and do not share the PMU
 b - use BPF aggregation (see perf stat --bpf-counters)

The 'p' modifier can be used for specifying how precise the instruction
address should be.
The 'p' modifier can be specified multiple times:

 0 - SAMPLE_IP can have arbitrary skid
 1 - SAMPLE_IP must have constant skid
 2 - SAMPLE_IP requested to have 0 skid
 3 - SAMPLE_IP must have 0 skid, or uses randomization to avoid
     sample shadowing effects.

For Intel systems precise event sampling is implemented with PEBS
which supports up to precise-level 2, and precise level 3 for
some special cases.

On AMD systems it is implemented using IBS (up to precise-level 2).
The precise modifier works with event types 0x76 (cpu-cycles, CPU
clocks not halted) and 0xC1 (micro-ops retired). Both events map to
IBS execution sampling (IBS op) with the IBS Op Counter Control bit
(IbsOpCntCtl) set accordingly (see the
Core Complex (CCX) -> Processor x86 Core -> Instruction Based Sampling (IBS)
section of the [AMD Processor Programming Reference (PPR)] relevant to the
family, model and stepping of the processor being used, or the AMD64
Architecture Programmer's Manual Volume 2: System Programming, 13.3
Instruction-Based Sampling). Examples of using IBS:

 perf record -a -e cpu-cycles:p ...    # use ibs op counting cycles
 perf record -a -e r076:p ...          # same as -e cpu-cycles:p
 perf record -a -e r0C1:p ...          # use ibs op counting micro-ops

RAW HARDWARE EVENT DESCRIPTOR
-----------------------------
Even when an event is not available in a symbolic form within perf right now,
it can be encoded in a per processor specific way.

For instance on x86 CPUs, a raw event is written as rN, where N is a
hexadecimal value that represents the raw register encoding with the
layout of IA32_PERFEVTSELx MSRs (see [Intel® 64 and IA-32 Architectures
Software Developer's Manual Volume 3B: System Programming Guide] Figure 30-1
Layout of IA32_PERFEVTSELx MSRs) or AMD's PERF_CTL MSRs (see the
Core Complex (CCX) -> Processor x86 Core -> MSR Registers section of the
[AMD Processor Programming Reference (PPR)] relevant to the family, model
and stepping of the processor being used).

Note: Only the following bit fields can be set in x86 counter
registers: event, umask, edge, inv, cmask. In particular, guest/host only and
OS/user mode flags must be set up using <<EVENT_MODIFIERS, EVENT
MODIFIERS>>.

Example:

If the Intel docs for a QM720 Core i7 describe an event as:

  Event  Umask  Event Mask
  Num.   Value  Mnemonic    Description                        Comment

  A8H    01H    LSD.UOPS    Counts the number of micro-ops     Use cmask=1 and
                            delivered by loop stream detector  invert to count
                                                               cycles

then a raw encoding of 0x1A8 can be used:

 perf stat -e r1a8 -a sleep 1
 perf record -e r1a8 ...

It's also possible to use pmu syntax:

 perf record -e r1a8 -a sleep 1
 perf record -e cpu/r1a8/ ...
 perf record -e cpu/r0x1a8/ ...

Some processors, like those from AMD, support event codes and unit masks
larger than a byte. In such cases, the bits corresponding to the event
configuration parameters can be seen with:

  cat /sys/bus/event_source/devices/<pmu>/format/<config>

Example:

If the AMD docs for an EPYC 7713 processor describe an event as:

  Event  Umask  Event Mask
  Num.   Value  Mnemonic                        Description

  28FH   03H    op_cache_hit_miss.op_cache_hit  Counts Op Cache micro-tag
                                                hit events.

then a raw encoding of 0x0328F cannot be used, since the upper nibble of the
EventSelect bits has to be specified via bits 32-35, as can be seen with:

  cat /sys/bus/event_source/devices/cpu/format/event

A raw encoding of 0x20000038F should be used instead:

 perf stat -e r20000038f -a sleep 1
 perf record -e r20000038f ...

It's also possible to use pmu syntax:

 perf record -e r20000038f -a sleep 1
 perf record -e cpu/r20000038f/ ...
 perf record -e cpu/r0x20000038f/ ...

You should refer to the processor specific documentation for getting these
details. Some of them are referenced in the SEE ALSO section below.
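
The two raw encodings in this section can be reproduced with shell
arithmetic. The following is a minimal sketch, not something perf provides:
the helper names intel_raw and amd_raw are hypothetical, and the bit positions
(event in bits 0-7, umask in bits 8-15, and on AMD the upper event nibble in
bits 32-35) are taken from the two examples in this section. Check the format
files in sysfs on the target machine before relying on them.

```shell
# Hypothetical helpers (not part of perf) that build the rN raw event
# string from an event number ($1) and a unit mask ($2).

# Layout assumed from the Intel example: event in bits 0-7, umask in bits 8-15.
intel_raw() {
	printf 'r%x\n' $(( ($1 & 0xff) | (($2 & 0xff) << 8) ))
}

# Layout assumed from the AMD example: event bits 0-7 stay in place,
# event bits 8-11 move to bits 32-35, umask goes in bits 8-15.
amd_raw() {
	printf 'r%x\n' $(( ($1 & 0xff) | (($2 & 0xff) << 8) | ((($1 >> 8) & 0xf) << 32) ))
}

intel_raw 0xa8 0x01     # LSD.UOPS, prints r1a8
amd_raw 0x28f 0x03      # op_cache_hit_miss.op_cache_hit, prints r20000038f
```

The printed strings can be passed directly to -e, for example
perf stat -e "$(amd_raw 0x28f 0x03)" -a sleep 1.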

ARBITRARY PMUS
--------------

perf also supports an extended syntax for specifying raw parameters
to PMUs. Using this typically requires looking up the specific event
in the CPU vendor specific documentation.

The available PMUs and their raw parameters can be listed with:

  ls /sys/devices/*/format

For example, the "LSD.UOPS" core pmu event above could be specified as:

  perf stat -e cpu/event=0xa8,umask=0x1,name=LSD.UOPS_CYCLES,cmask=0x1/ ...

or, using the extended name syntax:

  perf stat -e cpu/event=0xa8,umask=0x1,cmask=0x1,name=\'LSD.UOPS_CYCLES:cmask=0x1\'/ ...

PER SOCKET PMUS
---------------

Some PMUs are not associated with a core, but with a whole CPU socket.
Events on these PMUs generally cannot be sampled, but only counted globally
with perf stat -a. They can be bound to one logical CPU, but will measure
all the CPUs in the same socket.

This example measures memory bandwidth every second
on the first memory controller on socket 0 of an Intel Xeon system:

  perf stat -C 0 -a -e uncore_imc_0/cas_count_read/,uncore_imc_0/cas_count_write/ -I 1000 ...

Each memory controller has its own PMU. Measuring the complete system
bandwidth would require specifying all imc PMUs (see perf list output),
and adding the values together. To simplify creation of multiple events,
prefix and glob matching is supported in the PMU name, and the prefix
'uncore_' is also ignored when performing the match. So the command above
can be expanded to all memory controllers by using either of these syntaxes:

  perf stat -C 0 -a -e imc/cas_count_read/,imc/cas_count_write/ -I 1000 ...
  perf stat -C 0 -a -e *imc*/cas_count_read/,*imc*/cas_count_write/ -I 1000 ...

This example measures the combined core power every second:

  perf stat -I 1000 -e power/energy-cores/ -a

ACCESS RESTRICTIONS
-------------------

For non-root users generally only context switched PMU events are available.
This is normally only the events in the cpu PMU, the predefined events
like cycles and instructions and some software events.

Other PMUs and global measurements are normally root only.
Some event qualifiers, such as "any", are also root only.

This can be overridden by setting the kernel.perf_event_paranoid
sysctl to -1, which allows non-root users to use these events.

For accessing trace point events perf needs to have read access to
/sys/kernel/tracing, even when perf_event_paranoid is in a relaxed
setting.

TRACING
-------

Some PMUs control advanced hardware tracing capabilities, such as Intel PT,
that allow low overhead execution tracing. These are described in a separate
intel-pt.txt document.

PARAMETERIZED EVENTS
--------------------

Some pmu events listed by 'perf list' will be displayed with '?' in them. For
example:

  hv_gpci/dtbp_ptitc,phys_processor_idx=?/

This means that when provided as an event, a value for '?' must
also be supplied. For example:

  perf stat -C 0 -e 'hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/' ...

EVENT QUALIFIERS:

It is also possible to add extra qualifiers to an event:

percore:

Sums up the event counts for all hardware threads in a core, e.g.:

  perf stat -e cpu/event=0,umask=0x3,percore=1/

EVENT GROUPS
------------

Perf supports time based multiplexing of events, when the number of events
active exceeds the number of hardware performance counters. Multiplexing
can cause measurement errors when the workload changes its execution
profile.
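
To see where the error comes from: a multiplexed event is only counted while
it is scheduled on a counter, and the value reported for the whole run is
extrapolated from the time the event was enabled versus the time it actually
ran. The following is a minimal sketch of that extrapolation, assuming the
usual count * time_enabled / time_running scaling. scale_count is a
hypothetical helper, not a perf command.

```shell
# Hypothetical helper (not part of perf): estimate the full-interval count
# from a raw count collected only while the event was scheduled.
#   $1 = raw count, $2 = time enabled, $3 = time running (same time unit)
scale_count() {
	if [ "$3" -eq 0 ]; then
		echo 0          # event never ran, nothing to extrapolate from
	else
		echo $(( $1 * $2 / $3 ))
	fi
}

scale_count 1000 200 100   # event ran half the time, prints 2000
```

The estimate is only accurate if the event rate was the same while the event
was not scheduled, which is the assumption that breaks when the workload
changes its execution profile.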

When metrics are computed using formulas from event counts, it is useful to
ensure some events are always measured together as a group to minimize
multiplexing errors. Event groups can be specified using { }.

  perf stat -e '{instructions,cycles}' ...

The number of available performance counters depends on the CPU. A group
cannot contain more events than available counters.
For example Intel Core CPUs typically have four generic performance counters
for the core, plus three fixed counters for instructions, cycles and
ref-cycles. Some special events have restrictions on which counter they
can schedule, and may not support multiple instances in a single group.
When too many events are specified in the group some of them will not
be measured.

Globally pinned events can limit the number of counters available for
other groups. On x86 systems, the NMI watchdog pins a counter by default.
The NMI watchdog can be disabled as root with:

  echo 0 > /proc/sys/kernel/nmi_watchdog

Events from multiple different PMUs cannot be mixed in a group, with
some exceptions for software events.

LEADER SAMPLING
---------------

perf also supports group leader sampling using the :S specifier.

  perf record -e '{cycles,instructions}:S' ...
  perf report --group

Normally all events in an event group sample, but with :S only
the first event (the leader) samples, and it only reads the values of the
other events in the group.

However, in the case of AUX area events (e.g. Intel PT or CoreSight), the AUX
area event must be the leader, so then the second event samples, not the first.

OPTIONS
-------

Without options all known events will be listed.

To limit the list use:

. 'hw' or 'hardware' to list hardware events such as cache-misses, etc.

. 'sw' or 'software' to list software events such as context switches, etc.

. 'cache' or 'hwcache' to list hardware cache events such as L1-dcache-loads, etc.

. 'tracepoint' to list all tracepoint events, alternatively use
  'subsys_glob:event_glob' to filter by tracepoint subsystems such as sched,
  block, etc.

. 'pmu' to print the kernel supplied PMU events.

. 'sdt' to list all Statically Defined Tracepoint events.

. 'metric' to list metrics.

. 'metricgroup' to list metricgroups with metrics.

. If none of the above is matched, it will apply the supplied glob to all
  events, printing the ones that match.

. As a last resort, it will do a substring search in all event names.

One or more types can be used at the same time, listing the events for the
types specified.

Support raw format:

. '--raw-dump', shows the raw-dump of all the events.
. '--raw-dump [hw|sw|cache|tracepoint|pmu|event_glob]', shows the raw-dump of
  a certain kind of events.

SEE ALSO
--------
linkperf:perf-stat[1], linkperf:perf-top[1],
linkperf:perf-record[1],
http://www.intel.com/sdm/[Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3B: System Programming Guide],
https://bugzilla.kernel.org/show_bug.cgi?id=206537[AMD Processor Programming Reference (PPR)]