perf-list(1)
============

NAME
----
perf-list - List all symbolic event types

SYNOPSIS
--------
[verse]
'perf list' [--no-desc] [--long-desc]
            [hw|sw|cache|tracepoint|pmu|sdt|metric|metricgroup|event_glob]

DESCRIPTION
-----------
This command displays the symbolic event types which can be selected in the
various perf commands with the -e option.

OPTIONS
-------
-d::
--desc::
Print extra event descriptions. (default)

--no-desc::
Don't print descriptions.

-v::
--long-desc::
Print longer event descriptions.

--debug::
Enable debugging output.

--details::
Print how named events are resolved internally into perf events, and also
any extra expressions computed by perf stat.

--deprecated::
Print deprecated events. By default the deprecated events are hidden.

--unit::
Print PMU events and metrics limited to the specific PMU name.
(e.g. --unit cpu, --unit msr, --unit cpu_core, --unit cpu_atom)

-j::
--json::
Output in JSON format.

[[EVENT_MODIFIERS]]
EVENT MODIFIERS
---------------

Events can optionally have a modifier by appending a colon and one or
more modifiers. Modifiers allow the user to restrict the events to be
counted. The following modifiers exist:

 u - user-space counting
 k - kernel counting
 h - hypervisor counting
 I - non idle counting
 G - guest counting (in KVM guests)
 H - host counting (not in KVM guests)
 p - precise level
 P - use maximum detected precise level
 S - read sample value (PERF_SAMPLE_READ)
 D - pin the event to the PMU
 W - group is weak and will fall back to non-group if not schedulable
 e - group or event is exclusive and does not share the PMU

The 'p' modifier can be used for specifying how precise the instruction
address should be.
The 'p' modifier can be specified multiple times:

 0 - SAMPLE_IP can have arbitrary skid
 1 - SAMPLE_IP must have constant skid
 2 - SAMPLE_IP requested to have 0 skid
 3 - SAMPLE_IP must have 0 skid, or uses randomization to avoid
     sample shadowing effects.

On Intel systems precise event sampling is implemented with PEBS,
which supports up to precise-level 2, and precise-level 3 for
some special cases.

On AMD systems it is implemented using IBS (up to precise-level 2).
The precise modifier works with event types 0x76 (cpu-cycles, CPU
clocks not halted) and 0xC1 (micro-ops retired). Both events map to
IBS execution sampling (IBS op) with the IBS Op Counter Control bit
(IbsOpCntCtl) set accordingly (see the
Core Complex (CCX) -> Processor x86 Core -> Instruction Based Sampling (IBS)
section of the [AMD Processor Programming Reference (PPR)] relevant to the
family, model and stepping of the processor being used).

Examples of using IBS:

 perf record -a -e cpu-cycles:p ...    # use ibs op counting cycles
 perf record -a -e r076:p ...          # same as -e cpu-cycles:p
 perf record -a -e r0C1:p ...          # use ibs op counting micro-ops

RAW HARDWARE EVENT DESCRIPTOR
-----------------------------
Even when an event is not available in a symbolic form within perf right now,
it can be encoded in a processor specific way.

For instance, on x86 CPUs a raw event is written as rN, where N is a
hexadecimal value that represents the raw register encoding with the
layout of IA32_PERFEVTSELx MSRs (see [Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3B: System Programming Guide] Figure 30-1 Layout
of IA32_PERFEVTSELx MSRs) or AMD's PERF_CTL MSRs (see the
Core Complex (CCX) -> Processor x86 Core -> MSR Registers section of the
[AMD Processor Programming Reference (PPR)] relevant to the family, model
and stepping of the processor being used).

Note: Only the following bit fields can be set in x86 counter
registers: event, umask, edge, inv, cmask. In particular, guest/host only
and OS/user mode flags must be set up using
<<EVENT_MODIFIERS, EVENT MODIFIERS>>.

Example:

If the Intel docs for a QM720 Core i7 describe an event as:

  Event  Umask  Event Mask
  Num.   Value  Mnemonic    Description                        Comment

  A8H    01H    LSD.UOPS    Counts the number of micro-ops     Use cmask=1 and
                            delivered by loop stream detector  invert to count
                                                               cycles

raw encoding of 0x1A8 can be used:

 perf stat -e r1a8 -a sleep 1
 perf record -e r1a8 ...

It's also possible to use pmu syntax:

 perf record -e r1a8 -a sleep 1
 perf record -e cpu/r1a8/ ...
 perf record -e cpu/r0x1a8/ ...

Some processors, like those from AMD, support event codes and unit masks
larger than a byte. In such cases, the bits corresponding to the event
configuration parameters can be seen with:

  cat /sys/bus/event_source/devices/<pmu>/format/<config>

Example:

If the AMD docs for an EPYC 7713 processor describe an event as:

  Event  Umask  Event Mask
  Num.   Value  Mnemonic                        Description

  28FH   03H    op_cache_hit_miss.op_cache_hit  Counts Op Cache micro-tag
                                                hit events.

raw encoding of 0x0328F cannot be used since the upper nibble of the
EventSelect bits has to be specified via bits 32-35, as can be seen with:

  cat /sys/bus/event_source/devices/cpu/format/event

raw encoding of 0x20000038F should be used instead:

 perf stat -e r20000038f -a sleep 1
 perf record -e r20000038f ...

It's also possible to use pmu syntax:

 perf record -e r20000038f -a sleep 1
 perf record -e cpu/r20000038f/ ...
 perf record -e cpu/r0x20000038f/ ...

You should refer to the processor specific documentation for getting these
details. Some of them are referenced in the SEE ALSO section below.
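
The two raw encodings above can be reproduced with shell arithmetic. The
following is a minimal sketch, not part of perf: the helper names
`raw_x86_event` and `raw_amd_event` are hypothetical, the first assumes the
event select in bits 0-7 and the unit mask in bits 8-15, and the second
additionally assumes that the upper nibble of the event select goes into
bits 32-35, as described for the EPYC 7713 example. Check the `format`
files in sysfs for the PMU actually in use before relying on either layout.

```shell
#!/bin/sh
# Hypothetical helpers (not provided by perf) that build an rN descriptor.

# $1 = event select (8 bits), $2 = unit mask.
# Assumed layout: event in bits 0-7, umask in bits 8-15.
raw_x86_event() {
    printf 'r%x\n' $(( ($2 << 8) | ($1 & 0xff) ))
}

# $1 = event select (12 bits), $2 = unit mask.
# Assumed layout: event bits 0-7 in config bits 0-7, event bits 8-11 in
# config bits 32-35, umask in bits 8-15.
raw_amd_event() {
    printf 'r%x\n' $(( ((($1 >> 8) & 0xf) << 32) | ($2 << 8) | ($1 & 0xff) ))
}

raw_x86_event 0xa8 0x01     # LSD.UOPS, prints r1a8
raw_amd_event 0x28f 0x03    # op_cache_hit_miss.op_cache_hit, prints r20000038f
```

The printed strings can be passed directly to -e, for example
`perf stat -e "$(raw_amd_event 0x28f 0x03)" -a sleep 1`.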

ARBITRARY PMUS
--------------

perf also supports an extended syntax for specifying raw parameters
to PMUs. Using this typically requires looking up the specific event
in the CPU vendor specific documentation.

The available PMUs and their raw parameters can be listed with:

  ls /sys/devices/*/format

For example, the raw "LSD.UOPS" core PMU event above could
be specified as:

  perf stat -e cpu/event=0xa8,umask=0x1,name=LSD.UOPS_CYCLES,cmask=0x1/ ...

  or using extended name syntax

  perf stat -e cpu/event=0xa8,umask=0x1,cmask=0x1,name=\'LSD.UOPS_CYCLES:cmask=0x1\'/ ...

PER SOCKET PMUS
---------------

Some PMUs are not associated with a core, but with a whole CPU socket.
Events on these PMUs generally cannot be sampled, but only counted globally
with perf stat -a. They can be bound to one logical CPU, but will measure
all the CPUs in the same socket.

This example measures memory bandwidth every second
on the first memory controller on socket 0 of an Intel Xeon system:

  perf stat -C 0 -a -e uncore_imc_0/cas_count_read/,uncore_imc_0/cas_count_write/ -I 1000 ...

Each memory controller has its own PMU. Measuring the complete system
bandwidth would require specifying all imc PMUs (see perf list output),
and adding the values together. To simplify creation of multiple events,
prefix and glob matching is supported in the PMU name, and the prefix
'uncore_' is also ignored when performing the match. So the command above
can be expanded to all memory controllers by using either of these syntaxes:

  perf stat -C 0 -a -e imc/cas_count_read/,imc/cas_count_write/ -I 1000 ...
  perf stat -C 0 -a -e '*imc*/cas_count_read/,*imc*/cas_count_write/' -I 1000 ...

This example measures the combined core power every second:

  perf stat -I 1000 -e power/energy-cores/ -a

ACCESS RESTRICTIONS
-------------------

For non root users generally only context switched PMU events are available.
This is normally only the events in the cpu PMU, the predefined events
like cycles and instructions and some software events.

Other PMUs and global measurements are normally root only.
Some event qualifiers, such as "any", are also root only.

This can be overridden by setting the kernel.perf_event_paranoid
sysctl to -1, which allows non root users to use these events.

For accessing tracepoint events perf needs to have read access to
/sys/kernel/debug/tracing, even when perf_event_paranoid is in a relaxed
setting.

TRACING
-------

Some PMUs control advanced hardware tracing capabilities, such as Intel PT,
that allow low overhead execution tracing. These are described in a separate
intel-pt.txt document.

PARAMETERIZED EVENTS
--------------------

Some pmu events listed by 'perf list' will be displayed with '?' in them. For
example:

  hv_gpci/dtbp_ptitc,phys_processor_idx=?/

This means that when provided as an event, a value for '?' must
also be supplied. For example:

  perf stat -C 0 -e 'hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/' ...

EVENT QUALIFIERS:

It is also possible to add extra qualifiers to an event:

percore:

Sums up the event counts for all hardware threads in a core, e.g.:

  perf stat -e cpu/event=0,umask=0x3,percore=1/

EVENT GROUPS
------------

Perf supports time based multiplexing of events, when the number of events
active exceeds the number of hardware performance counters. Multiplexing
can cause measurement errors when the workload changes its execution
profile.
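
Why multiplexing introduces errors can be sketched with a little arithmetic.
A multiplexed event only occupies a counter for part of the run, so its
count is commonly extrapolated by the ratio of the time the event was
enabled to the time it was actually running (perf_event_open(2) exposes
these as time_enabled and time_running). The helper name `scale_count`
below is hypothetical and the numbers are made up for illustration; this
is an approximation of the idea, not perf's implementation.

```shell
#!/bin/sh
# Hypothetical helper, not part of perf: extrapolate a multiplexed count.
# $1 = raw count observed while the event was on a counter
# $2 = time the event was enabled
# $3 = time the event was actually running (same unit as $2)
scale_count() {
    echo $(( $1 * $2 / $3 ))
}

# An event that held a counter for only a quarter of the run:
scale_count 1000000 2000 500    # prints 4000000
```

The extrapolation assumes the event rate while the event was not running
matched the rate while it was, which is exactly what fails when the workload
changes its execution profile.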

When metrics are computed using formulas from event counts, it is useful to
ensure some events are always measured together as a group to minimize multiplexing
errors. Event groups can be specified using { }.

  perf stat -e '{instructions,cycles}' ...

The number of available performance counters depends on the CPU. A group
cannot contain more events than available counters.
For example, Intel Core CPUs typically have four generic performance counters
for the core, plus three fixed counters for instructions, cycles and
ref-cycles. Some special events have restrictions on which counter they
can schedule, and may not support multiple instances in a single group.
When too many events are specified in the group some of them will not
be measured.

Globally pinned events can limit the number of counters available for
other groups. On x86 systems, the NMI watchdog pins a counter by default.
The NMI watchdog can be disabled as root with:

  echo 0 > /proc/sys/kernel/nmi_watchdog

Events from multiple different PMUs cannot be mixed in a group, with
some exceptions for software events.

LEADER SAMPLING
---------------

perf also supports group leader sampling using the :S specifier.

  perf record -e '{cycles,instructions}:S' ...
  perf report --group

Normally all events in an event group sample, but with :S only
the first event (the leader) samples, and it only reads the values of the
other events in the group.

However, in the case of AUX area events (e.g. Intel PT or CoreSight), the AUX
area event must be the leader, so then the second event samples, not the first.

OPTIONS
-------

Without options all known events will be listed.

To limit the list use:

. 'hw' or 'hardware' to list hardware events such as cache-misses, etc.

. 'sw' or 'software' to list software events such as context switches, etc.

. 'cache' or 'hwcache' to list hardware cache events such as L1-dcache-loads, etc.

. 'tracepoint' to list all tracepoint events, alternatively use
  'subsys_glob:event_glob' to filter by tracepoint subsystems such as sched,
  block, etc.

. 'pmu' to print the kernel supplied PMU events.

. 'sdt' to list all Statically Defined Tracepoint events.

. 'metric' to list metrics.

. 'metricgroup' to list metricgroups with metrics.

. If none of the above is matched, it will apply the supplied glob to all
  events, printing the ones that match.

. As a last resort, it will do a substring search in all event names.

One or more types can be used at the same time, listing the events for the
types specified.

Support raw format:

. '--raw-dump', shows the raw-dump of all the events.
. '--raw-dump [hw|sw|cache|tracepoint|pmu|event_glob]', shows the raw-dump of
  a certain kind of events.

SEE ALSO
--------
linkperf:perf-stat[1], linkperf:perf-top[1],
linkperf:perf-record[1],
http://www.intel.com/sdm/[Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 3B: System Programming Guide],
https://bugzilla.kernel.org/show_bug.cgi?id=206537[AMD Processor Programming Reference (PPR)]