perf-record(1)
==============

NAME
----
perf-record - Run a command and record its profile into perf.data

SYNOPSIS
--------
[verse]
'perf record' [-e <EVENT> | --event=EVENT] [-a] <command>
'perf record' [-e <EVENT> | --event=EVENT] [-a] \-- <command> [<options>]

DESCRIPTION
-----------
This command runs a command and gathers a performance counter profile
from it, into perf.data - without displaying anything.

This file can then be inspected later on, using 'perf report'.


OPTIONS
-------
<command>...::
	Any command you can specify in a shell.

-e::
--event=::
	Select the PMU event. Selection can be:

	- a symbolic event name (use 'perf list' to list all events)

	- a raw PMU event in the form of rN where N is a hexadecimal value
	  that represents the raw register encoding with the layout of the
	  event control registers as described by entries in
	  /sys/bus/event_source/devices/cpu/format/*.

	- a symbolic or raw PMU event followed by an optional colon
	  and a list of event modifiers, e.g., cpu-cycles:p.  See the
	  linkperf:perf-list[1] man page for details on event modifiers.

	- a symbolically formed PMU event like 'pmu/param1=0x3,param2/' where
	  'param1', 'param2', etc are defined as formats for the PMU in
	  /sys/bus/event_source/devices/<pmu>/format/*.

	- a symbolically formed event like 'pmu/config=M,config1=N,config2=K/'
	  where M, N, K are numbers (in decimal, hex, octal format). Acceptable
	  values for each of 'config', 'config1' and 'config2' are defined by
	  corresponding entries in /sys/bus/event_source/devices/<pmu>/format/*.

	  There are also some parameters which are not defined in .../<pmu>/format/*.
	  These params can be used to override default config values per event.
	  Here are some common parameters:
	  - 'period': Set event sampling period
	  - 'freq': Set event sampling frequency
	  - 'time': Disable/enable time stamping. Acceptable values are 1 to
		    enable time stamping and 0 to disable it.
		    The default is 1.
	  - 'call-graph': Disable/enable callgraph. Acceptable strings are "fp" for
			 FP mode, "dwarf" for DWARF mode, "lbr" for LBR mode and
			 "no" to disable callgraph.
	  - 'stack-size': User stack size for dwarf mode
	  - 'name': User defined event name. Single quotes (') may be used to
		    escape symbols in the name from parsing by shell and tool
		    like this: name=\'CPU_CLK_UNHALTED.THREAD:cmask=0x1\'.
	  - 'aux-output': Generate AUX records instead of events. This requires
			  that an AUX area event is also provided.
	  - 'aux-sample-size': Set sample size for AUX area sampling. If the
			  '--aux-sample' option has been used, set
			  aux-sample-size=0 to disable AUX area sampling for
			  the event.

	  See the linkperf:perf-list[1] man page for more parameters.

	  Note: If the user explicitly sets options which conflict with the
	  params, the value set by the parameters will be overridden.

	  Also not defined in .../<pmu>/format/* are PMU driver specific
	  configuration parameters.  Any configuration parameter preceded by
	  the letter '@' is not interpreted in user space and is sent down
	  directly to the PMU driver.  For example:

	  perf record -e some_event/@cfg1,@cfg2=config/ ...

	  will see 'cfg1' and 'cfg2=config' pushed to the PMU driver associated
	  with the event for further processing.  There is no restriction on
	  what the configuration parameters are, as long as their semantics are
	  understood and supported by the PMU driver.

	- a hardware breakpoint event in the form of '\mem:addr[/len][:access]'
	  where addr is the memory address at which to break and access is
	  the memory access type (read, write, execute), passed as
	  '\mem:addr[:[r][w][x]]'. len is the number of bytes, starting at
	  addr, that the breakpoint covers.
	  To profile read-write accesses at 0x1000, use 'mem:0x1000:rw'.
	  To profile write accesses in [0x1000, 0x1008), use 'mem:0x1000/8:w'.

	- a group of events surrounded by a pair of braces ("{event1,event2,...}").
	  Events are separated by commas, and the group should be quoted to
	  prevent shell interpretation.  You also need to use --group on
	  "perf report" to view group events together.

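The hardware breakpoint syntax described above can be assembled mechanically. The following is a minimal sketch using a hypothetical shell helper, `mem_event`, which is not part of perf; it only builds the event string and prints a command line instead of running it, and `./my_program` is a placeholder.

```shell
# Hypothetical helper (not part of perf): build a hardware breakpoint
# event string of the form mem:addr[/len][:access].
mem_event() {
    spec="mem:$1"
    # Append the optional length in bytes.
    if [ -n "${2:-}" ]; then
        spec="$spec/$2"
    fi
    # Append the optional access type: any combination of r, w and x.
    if [ -n "${3:-}" ]; then
        spec="$spec:$3"
    fi
    printf '%s\n' "$spec"
}

# Read-write accesses at 0x1000, then writes to the 8 bytes starting there.
mem_event 0x1000 "" rw
mem_event 0x1000 8 w

# Print (without running) a matching command line.
echo perf record -e "$(mem_event 0x1000 8 w)" -- ./my_program
```

Quoting the generated string, as done here with the command substitution in double quotes, keeps the shell from interpreting any of its characters.
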
--filter=<filter>::
	Event filter.  This option should follow an event selector (-e).
	If the event is a tracepoint, the filter string will be parsed by
	the kernel.  If the event is a hardware trace PMU (e.g. Intel PT
	or CoreSight), it'll be processed as an address filter.  Otherwise
	it means a general filter using BPF which can be applied for any
	kind of event.

	- tracepoint filters

	In the case of tracepoints, multiple '--filter' options are combined
	using '&&'.

	- address filters

	A hardware trace PMU advertises its ability to accept a number of
	address filters by specifying a non-zero value in
	/sys/bus/event_source/devices/<pmu>/nr_addr_filters.

	Address filters have the format:

	filter|start|stop|tracestop <start> [/ <size>] [@<file name>]

	Where:
	- 'filter': defines a region that will be traced.
	- 'start': defines an address at which tracing will begin.
	- 'stop': defines an address at which tracing will stop.
	- 'tracestop': defines a region in which tracing will stop.

	<file name> is the name of the object file, <start> is the offset to the
	code to trace in that file, and <size> is the size of the region to
	trace. 'start' and 'stop' filters need not specify a <size>.

	If no object file is specified then the kernel is assumed, in which case
	the start address must be a current kernel memory address.

	<start> can also be specified by providing the name of a symbol. If the
	symbol name is not unique, it can be disambiguated by inserting #n where
	'n' selects the n'th symbol in address order. Alternately #0, #g or #G
	select only a global symbol. <size> can also be specified by providing
	the name of a symbol, in which case the size is calculated to the end
	of that symbol. For 'filter' and 'tracestop' filters, if <size> is
	omitted and <start> is a symbol, then the size is calculated to the end
	of that symbol.

	If <size> is omitted and <start> is '*', then the start and size will
	be calculated from the first and last symbols, i.e. to trace the whole
	file.

	If symbol names (or '*') are provided, they must be surrounded by white
	space.

	The filter passed to the kernel is not necessarily the same as entered.
	To see the filter that is passed, use the -v option.

	The kernel may not be able to configure a trace region if it is not
	within a single mapping.  MMAP events (or /proc/<pid>/maps) can be
	examined to determine if that is a possibility.

	Multiple filters can be separated with space or comma.

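To check which PMUs advertise address filter support, the nr_addr_filters files mentioned above can be scanned. This is a minimal sketch: the function name `list_addr_filter_pmus` is made up for illustration, and the optional directory argument exists only so the function can be pointed at a different tree.

```shell
# Hypothetical helper: print "<pmu> <count>" for every PMU whose
# nr_addr_filters file holds a non-zero value.
list_addr_filter_pmus() {
    root=${1:-/sys/bus/event_source/devices}
    for f in "$root"/*/nr_addr_filters; do
        # Skip PMUs without the file (or an unmatched glob).
        [ -r "$f" ] || continue
        n=$(cat "$f")
        # Skip PMUs that report zero filters or a non-numeric value.
        [ "$n" -gt 0 ] 2>/dev/null || continue
        pmu=${f%/nr_addr_filters}
        printf '%s %s\n' "${pmu##*/}" "$n"
    done
}

list_addr_filter_pmus
```

On a machine without a hardware trace PMU the function prints nothing.
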
	- bpf filters

	A BPF filter can access the sample data and make a decision based on the
	data.  Users need to set an appropriate sample type to use the BPF
	filter.  BPF filters need root privilege.

	The sample data field can be specified in lower case letters.  Multiple
	filters can be separated with commas.  For example,

	  --filter 'period > 1000, cpu == 1'
	or
	  --filter 'mem_op == load || mem_op == store, mem_lvl > l1'

	The former filter only accepts samples with a period greater than 1000
	AND a CPU number of 1.  The latter accepts either load or store memory
	operations, but only at a memory level above L1.  Since the
	mem_op and mem_lvl fields come from the (memory) data_source, it only
	works with events that set the data_source field.

	The user should also request collection of that information (with the
	-d option in the above case).  Otherwise, the following message will be
	shown.

	  $ sudo perf record -e cycles --filter 'mem_op == load'
	  Error: cycles event does not have PERF_SAMPLE_DATA_SRC
	   Hint: please add -d option to perf record.
	  failed to set filter "BPF" on event cycles with 22 (Invalid argument)

	Essentially the BPF filter expression is:

	  <term> <operator> <value> (("," | "||") <term> <operator> <value>)*

	The <term> can be one of:
	  ip, id, tid, pid, cpu, time, addr, period, txn, weight, phys_addr,
	  code_pgsz, data_pgsz, weight1, weight2, weight3, ins_lat, retire_lat,
	  p_stage_cyc, mem_op, mem_lvl, mem_snoop, mem_remote, mem_lock,
	  mem_dtlb, mem_blk, mem_hops

	The <operator> can be one of:
	  ==, !=, >, >=, <, <=, &

	The <value> can be one of:
	  <number> (for any term)
	  na, load, store, pfetch, exec (for mem_op)
	  l1, l2, l3, l4, cxl, io, any_cache, lfb, ram, pmem (for mem_lvl)
	  na, none, hit, miss, hitm, fwd, peer (for mem_snoop)
	  remote (for mem_remote)
	  na, locked (for mem_lock)
	  na, l1_hit, l1_miss, l2_hit, l2_miss, any_hit, any_miss, walk, fault (for mem_dtlb)
	  na, by_data, by_addr (for mem_blk)
	  hops0, hops1, hops2, hops3 (for mem_hops)

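The way "," and "||" combine in the two example filters can be modelled in plain shell. This is only an illustration of the acceptance rule as described above, not how perf or the BPF program implements it; the function names are made up, and the memory level is reduced to a hypothetical yes/no flag because its numeric encoding is not given here.

```shell
# Model of: --filter 'period > 1000, cpu == 1'
# The comma means both conditions must hold.
filter_period_and_cpu() {
    if [ "$1" -gt 1000 ] && [ "$2" -eq 1 ]; then
        echo accepted
    else
        echo dropped
    fi
}

# Model of: --filter 'mem_op == load || mem_op == store, mem_lvl > l1'
# $1 is the memory operation, $2 is "yes" when the level is above L1.
filter_mem() {
    if { [ "$1" = load ] || [ "$1" = store ]; } && [ "$2" = yes ]; then
        echo accepted
    else
        echo dropped
    fi
}

filter_period_and_cpu 2000 1
filter_period_and_cpu 2000 0
filter_mem store yes
filter_mem exec yes
```
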
--exclude-perf::
	Don't record events issued by perf itself. This option should follow
	an event selector (-e) which selects tracepoint event(s). It adds a
	filter expression 'common_pid != $PERFPID' to filters. If other
	'--filter' options exist, the new filter expression will be combined
	with them by '&&'.

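The effect of combining tracepoint filters with '&&' can be sketched as string concatenation. The helper `combine_filters` is hypothetical, the parenthesization is an assumption, and the exact string perf passes to the kernel may differ; 'prev_pid == 0' and the PID 1234 are placeholders.

```shell
# Hypothetical helper: AND together any number of tracepoint filter
# expressions, wrapping each one in parentheses.
combine_filters() {
    combined=
    for expr in "$@"; do
        if [ -z "$combined" ]; then
            combined="($expr)"
        else
            combined="$combined && ($expr)"
        fi
    done
    printf '%s\n' "$combined"
}

perf_pid=1234   # placeholder for the PID of the perf process itself
combine_filters 'prev_pid == 0' "common_pid != $perf_pid"
```
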
-a::
--all-cpus::
	System-wide collection from all CPUs (default if no target is specified).

-p::
--pid=::
	Record events on existing process ID (comma separated list).

-t::
--tid=::
	Record events on existing thread ID (comma separated list).
	This option also disables inheritance by default.  Enable it by adding
	--inherit.

-u::
--uid=::
	Record events in threads owned by uid. Name or number.

-r::
--realtime=::
	Collect data with this RT SCHED_FIFO priority.

--no-buffering::
	Collect data without buffering.

-c::
--count=::
	Event period to sample.

-o::
--output=::
	Output file name.

-i::
--no-inherit::
	Child tasks do not inherit counters.

-F::
--freq=::
	Profile at this frequency. Use 'max' to use the current maximum
	allowed frequency, i.e. the value in the kernel.perf_event_max_sample_rate
	sysctl. A higher requested frequency is throttled down to the current
	maximum allowed frequency. See --strict-freq.

--strict-freq::
	Fail if the specified frequency can't be used.

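The interaction between -F and --strict-freq can be modelled as a small function. This is a sketch of the documented behaviour only, not perf's code; `effective_freq` and its "strict" argument are made up for illustration, and 100000 is a placeholder used when the sysctl file cannot be read.

```shell
# Hypothetical model: print the frequency that would be used for a
# requested value, given the current limit. With "strict" as the third
# argument, a request above the limit fails instead of being throttled.
effective_freq() {
    requested=$1
    max=$2
    strict=${3:-}
    if [ "$requested" = max ]; then
        echo "$max"
    elif [ "$requested" -le "$max" ]; then
        echo "$requested"
    elif [ "$strict" = strict ]; then
        echo "frequency $requested exceeds the limit of $max" >&2
        return 1
    else
        echo "$max"
    fi
}

# Read the current limit, falling back to a placeholder value.
limit=$(cat /proc/sys/kernel/perf_event_max_sample_rate 2>/dev/null || echo 100000)
effective_freq 999 "$limit"
effective_freq max "$limit"
```
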
-m::
--mmap-pages=::
	Number of mmap data pages (must be a power of two) or size
	specification with appended unit character - B/K/M/G. The
	size is rounded up to the nearest power-of-two number of pages.
	Also, by adding a comma, the number of mmap pages for AUX
	area tracing can be specified.

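The rounding of a size specification to a power-of-two page count can be worked through in shell arithmetic. This sketch assumes a 4096-byte page unless another page size is passed; `round_up_pages` is a hypothetical name, and perf's own rounding code may differ in detail.

```shell
# Hypothetical helper: convert a size in bytes to a page count rounded
# up to the next power of two. $2 is the page size (default 4096).
round_up_pages() {
    bytes=$1
    page_size=${2:-4096}
    # Whole pages needed to hold the requested size.
    pages=$(( (bytes + page_size - 1) / page_size ))
    # Smallest power of two that is >= pages.
    pow=1
    while [ "$pow" -lt "$pages" ]; do
        pow=$(( pow * 2 ))
    done
    echo "$pow"
}

round_up_pages 102400     # 100K needs 25 pages of 4096 bytes
round_up_pages 1048576    # 1M is exactly 256 pages
```

Under the 4096-byte assumption, a 100K request therefore ends up as 32 pages, while 1M stays at 256 pages.
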
-g::
	Enables call-graph (stack chain/backtrace) recording for both
	kernel space and user space.

--call-graph::
	Setup and enable call-graph (stack chain/backtrace) recording,
	implies -g.  Default is "fp" (for user space).

	The unwinding method used for kernel space is dependent on the
	unwinder used by the active kernel configuration, i.e.
	CONFIG_UNWINDER_FRAME_POINTER (fp) or CONFIG_UNWINDER_ORC (orc).

	Any option specified here controls the method used for user space.

	Valid options are "fp" (frame pointer), "dwarf" (DWARF's CFI -
	Call Frame Information) or "lbr" (Hardware Last Branch Record
	facility).

	On some systems, where binaries are built with gcc
	-fomit-frame-pointer, the "fp" method will produce bogus
	call graphs; "dwarf", if available (perf tools linked to
	the libunwind or libdw library), should be used instead.
	The "lbr" method doesn't require any compiler options. It
	produces call graphs from the hardware LBR registers. The
	main limitation is that it is only available on new Intel
	platforms, such as Haswell. It can only get the user call chain, and
	it doesn't work with branch stack sampling at the same time.

	When "dwarf" recording is used, perf also records a (user) stack dump
	when sampled.  The default size of the stack dump is 8192 (bytes).
	The user can change the size by passing it after a comma, like
	"--call-graph dwarf,4096".

	When "fp" recording is used, perf tries to save stack entries
	up to the number specified in the kernel.perf_event_max_stack sysctl
	by default.  The user can change the number by passing it after a comma,
	like "--call-graph fp,32".

-q::
--quiet::
	Don't print any warnings or messages, useful for scripting.

-v::
--verbose::
	Be more verbose (show counter open errors, etc).

-s::
--stat::
	Record per-thread event counts.  Use it with 'perf report -T' to see
	the values.

-d::
--data::
	Record the sample virtual addresses.

--phys-data::
	Record the sample physical addresses.

--data-page-size::
	Record the sampled data address data page size.

--code-page-size::
	Record the sampled code address (ip) page size.

-T::
--timestamp::
	Record the sample timestamps. Use it with 'perf report -D' to see the
	timestamps, for instance.

-P::
--period::
	Record the sample period.

--sample-cpu::
	Record the sample cpu.

--sample-identifier::
	Record the sample identifier i.e. PERF_SAMPLE_IDENTIFIER bit set in
	the sample_type member of the struct perf_event_attr argument to the
	perf_event_open system call.

-n::
--no-samples::
	Don't sample.

-R::
--raw-samples::
Collect raw sample records from all opened counters (default for tracepoint counters).

-C::
--cpu::
Collect samples only on the list of CPUs provided. Multiple CPUs can be provided as a
comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2.
In per-thread mode with inheritance mode on (default), samples are captured only when
the thread executes on the designated CPUs. Default is to monitor all CPUs.

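The CPU list syntax (comma-separated entries, with '-' for ranges) can be expanded as follows. The helper `expand_cpu_list` is hypothetical and only illustrates how a list such as 0-2,5 is read; perf does its own parsing.

```shell
# Hypothetical helper: expand a CPU list such as "0,1" or "0-2,5" into
# the individual CPU numbers it selects.
expand_cpu_list() {
    cpus=
    for part in $(echo "$1" | tr ',' ' '); do
        case $part in
            *-*) first=${part%-*}; last=${part#*-} ;;   # a range like 0-2
            *)   first=$part; last=$part ;;             # a single CPU
        esac
        cpu=$first
        while [ "$cpu" -le "$last" ]; do
            cpus="$cpus${cpus:+ }$cpu"
            cpu=$((cpu + 1))
        done
    done
    echo "$cpus"
}

expand_cpu_list 0,1
expand_cpu_list 0-2,5
```
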
-B::
--no-buildid::
Do not save the build ids of binaries in the perf.data files. This skips
post processing after recording, which sometimes makes the final step in
the recording process take a long time, as it needs to process all
events looking for mmap records. The downside is that it can misresolve
symbols if the workload binaries used when recording get locally rebuilt
or upgraded, because the only key available in this case is the
pathname. You can also set the "record.build-id" config variable to
'skip' to have this behaviour permanently.

-N::
--no-buildid-cache::
Do not update the buildid cache. This saves some overhead in situations
where the information in the perf.data file (which includes buildids)
is sufficient.  You can also set the "record.build-id" config variable to
'no-cache' to have the same effect.

-G name,...::
--cgroup name,...::
Monitor only in the container (cgroup) called "name". This option is available only
in per-cpu mode. The cgroup filesystem must be mounted. All threads belonging to
container "name" are monitored when they run on the monitored CPUs. Multiple cgroups
can be provided. Each cgroup is applied to the corresponding event, i.e., first cgroup
to first event, second cgroup to second event and so on. It is possible to provide
an empty cgroup (monitor all the time) using, e.g., -G foo,,bar. Cgroups must have
corresponding events, i.e., they always refer to events defined earlier on the command
line. If the user wants to track multiple events for a specific cgroup, the user can
use '-e e1 -e e2 -G foo,foo' or just use '-e e1 -e e2 -G foo'.

To monitor, say, 'cycles' for a cgroup and also system wide, this
command line can be used: 'perf stat -e cycles -G cgroup_name -a -e cycles'.

-b::
--branch-any::
Enable taken branch stack sampling. Any type of taken branch may be sampled.
This is a shortcut for --branch-filter any. See --branch-filter for more information.

-j::
--branch-filter::
Enable taken branch stack sampling. Each sample captures a series of consecutive
taken branches. The number of branches captured with each sample depends on the
underlying hardware, the type of branches of interest, and the executed code.
It is possible to select the types of branches captured by enabling filters. The
following filters are defined:

	- any: any type of branches
	- any_call: any function call or system call
	- any_ret: any function return or system call return
	- ind_call: any indirect call
	- ind_jmp: any indirect jump
	- call: direct calls, including far (to/from kernel) calls
	- u: only when the branch target is at the user level
	- k: only when the branch target is in the kernel
	- hv: only when the target is at the hypervisor level
	- in_tx: only when the target is in a hardware transaction
	- no_tx: only when the target is not in a hardware transaction
	- abort_tx: only when the target is a hardware transaction abort
	- cond: conditional branches
	- call_stack: save call stack
	- no_flags: don't save branch flags, e.g., prediction, misprediction, etc.
	- no_cycles: don't save branch cycles
	- hw_index: save branch hardware index
	- save_type: save branch type during sampling in case binary is not available later
		     For the platforms with Intel Arch LBR support (12th-Gen+ client or
		     4th-Gen Xeon+ server), the save branch type is unconditionally enabled
		     when the taken branch stack sampling is enabled.
	- priv: save privilege state during sampling in case binary is not available later

+
The option requires at least one branch type among any, any_call, any_ret, ind_call, cond.
The privilege levels may be omitted, in which case, the privilege levels of the associated
event are applied to the branch filter. Both kernel (k) and hypervisor (hv) privilege
levels are subject to permissions.  When sampling on multiple events, branch stack sampling
is enabled for all the sampling events. The sampled branch type is the same for all events.
The various filters must be specified as a comma separated list: --branch-filter any_ret,u,k
Note that this feature may not be available on all processors.

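The rule that a branch filter list needs at least one branch type can be checked before invoking perf. This is a sketch with a hypothetical helper, `has_branch_type`; it tests only for the five types named in the rule above and does not validate the remaining list entries.

```shell
# Hypothetical helper: print "yes" if a comma separated branch filter
# list contains one of any, any_call, any_ret, ind_call or cond.
has_branch_type() {
    for item in $(echo "$1" | tr ',' ' '); do
        case $item in
            any|any_call|any_ret|ind_call|cond)
                echo yes
                return 0
                ;;
        esac
    done
    echo no
}

has_branch_type any_ret,u,k   # branch type plus two privilege levels
has_branch_type u,k           # privilege levels only
```
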
-W::
--weight::
Enable weighted sampling. An additional weight is recorded per sample and can be
displayed with the weight and local_weight sort keys.  This currently works for TSX
abort events and some memory events in precise mode on modern Intel CPUs.

--namespaces::
Record events of type PERF_RECORD_NAMESPACES.  This enables the 'cgroup_id' sort key.

--all-cgroups::
Record events of type PERF_RECORD_CGROUP.  This enables the 'cgroup' sort key.

--transaction::
Record transaction flags for transaction related events.

--per-thread::
Use per-thread mmaps.  By default per-cpu mmaps are created.  This option
overrides that and uses per-thread mmaps.  A side-effect of that is that
inheritance is automatically disabled.  --per-thread is ignored with a warning
if combined with -a or -C options.

-D::
--delay=::
After starting the program, wait msecs before measuring (-1: start with events
disabled), or enable events only for specified ranges of msecs (e.g.
-D 10-20,30-40 means wait 10 msecs, enable for 10 msecs, wait 10 msecs, enable
for 10 msecs, then stop). Note, delaying enabling of events is useful to filter
out the startup phase of the program, which is often very different.

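The three forms of the -D argument can be spelled out with a short sketch. The helper `describe_delay` is hypothetical; it only restates the schedule described above in words and does not run perf.

```shell
# Hypothetical helper: describe what a -D argument means.
describe_delay() {
    case $1 in
        -1)
            echo "start with events disabled"
            return 0
            ;;
        *-*)
            ;;   # one or more start-stop ranges, handled below
        *)
            echo "wait $1 ms, then enable events"
            return 0
            ;;
    esac
    prev=0
    for range in $(echo "$1" | tr ',' ' '); do
        start=${range%-*}
        stop=${range#*-}
        # Time since the previous range ended, then the enabled window.
        echo "wait $((start - prev)) ms, enable for $((stop - start)) ms"
        prev=$stop
    done
}

describe_delay 500
describe_delay 10-20,30-40
```
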
4844b6c5177SStephane Eranian-I::
4854b6c5177SStephane Eranian--intr-regs::
4864b6c5177SStephane EranianCapture machine state (registers) at interrupt, i.e., on counter overflows for
4874b6c5177SStephane Eranianeach sample. List of captured registers depends on the architecture. This option
488bcc84ec6SStephane Eranianis off by default. It is possible to select the registers to sample using their
489bcc84ec6SStephane Eraniansymbolic names, e.g. on x86, ax, si. To list the available registers use
490bcc84ec6SStephane Eranian--intr-regs=\?. To name registers, pass a comma separated list such as
491bcc84ec6SStephane Eranian--intr-regs=ax,bx. The list of register is architecture dependent.

--user-regs::
Similar to -I, but capture user registers at sample time. To list the available
user registers use --user-regs=\?.

--running-time::
Record running and enabled time for read events (:S).

-k::
--clockid::
Sets the clock id to use for the various time fields in the perf_event_type
records. See clock_gettime(). In particular CLOCK_MONOTONIC and
CLOCK_MONOTONIC_RAW are supported; some events might also allow
CLOCK_BOOTTIME, CLOCK_REALTIME and CLOCK_TAI.

-S::
--snapshot::
Select AUX area tracing Snapshot Mode. This option is valid only with an
AUX area tracing event. Optionally, certain snapshot capturing parameters
can be specified in a string that follows this option:

  - 'e': take one last snapshot on exit; guarantees that there is at least one
       snapshot in the output file;
  - <size>: if the PMU supports this, specify the desired snapshot size.

In Snapshot Mode trace data is captured only when signal SIGUSR2 is received
and on exit if the above 'e' option is given.
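
Triggering a snapshot from a script amounts to sending SIGUSR2 to the recording
process. The sketch below illustrates only the signalling: a background shell
loop stands in for 'perf record -S', and the names stand_in_recorder and log
are hypothetical.

```shell
#!/bin/bash
# A background loop stands in for 'perf record -S ...' here; it only
# counts the SIGUSR2 signals it receives.
log=$(mktemp)

stand_in_recorder() {
    trap 'echo snapshot >>"$log"' USR2
    echo ready >>"$log"
    while :; do sleep 0.1; done
}
stand_in_recorder &
recorder_pid=$!

# Wait until the stand-in has installed its signal handler.
until grep -q ready "$log"; do sleep 0.1; done

# Request two snapshots, as one would with the PID of a real recording.
kill -USR2 "$recorder_pid"; sleep 0.5
kill -USR2 "$recorder_pid"; sleep 0.5

kill "$recorder_pid"
wait "$recorder_pid" 2>/dev/null || true
snapshots=$(grep -c snapshot "$log")
rm -f "$log"
echo "snapshots requested: $snapshots"
```

In a real session, recorder_pid would be the PID of the 'perf record -S'
process rather than of the stand-in.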

--aux-sample[=OPTIONS]::
Select AUX area sampling. At least one of the events selected by the -e option
must be an AUX area event. Samples on other events will be created containing
data from the AUX area. Optionally sample size may be specified, otherwise it
defaults to 4KiB.

--proc-map-timeout::
Processing /proc/XXX/maps for pre-existing threads may take a long time,
because the file may be huge. A timeout is needed in such cases.
This option sets the timeout limit. The default value is 500 ms.

--switch-events::
Record context switch events i.e. events of type PERF_RECORD_SWITCH or
PERF_RECORD_SWITCH_CPU_WIDE. In some cases (e.g. Intel PT, CoreSight or Arm SPE)
switch events will be enabled automatically, which can be suppressed by
the option --no-switch-events.

--vmlinux=PATH::
Specify vmlinux path which has debuginfo.
(enabled when BPF prologue is on)

--buildid-all::
Record the build-id of all DSOs regardless of whether they are actually hit or not.

--buildid-mmap::
Record build ids in mmap2 events. This disables the build id cache (implies --no-buildid).

--aio[=n]::
Use <n> control blocks in asynchronous (POSIX AIO) trace writing mode (default: 1, max: 4).
Asynchronous mode is supported only when the perf tool is linked with a libc library
that provides an implementation of the POSIX AIO API.

--affinity=mode::
Set affinity mask of trace reading thread according to the policy defined by 'mode' value:

  - node - thread affinity mask is set to NUMA node cpu mask of the processed mmap buffer
  - cpu  - thread affinity mask is set to cpu of the processed mmap buffer

--mmap-flush=number::

Specify the minimum number of bytes that are extracted from mmap data pages and
processed for output. One can specify the number using B/K/M/G suffixes.

The maximal allowed value is a quarter of the size of mmaped data pages.

The default option value is 1 byte, which means that every time the output
writing thread finds some new data in the mmaped buffer, the data is extracted,
possibly compressed (-z) and written to the output, perf.data or pipe.

Larger data chunks are compressed more effectively in comparison to smaller
chunks, so extraction of larger chunks from the mmap data pages is preferable
from the perspective of output size reduction.

Also, in some cases, executing fewer output write syscalls with larger data
sizes can take less time than executing more output write syscalls with smaller
data sizes, thus lowering runtime profiling overhead.
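
As a worked example of the quarter-size limit, here is a sketch under assumed
values. The helper name max_mmap_flush_bytes, the 4096-byte page size and the
128-page buffer are illustrative assumptions, not values taken from perf.

```shell
#!/bin/bash
# Hypothetical helper: largest --mmap-flush value for a buffer of
# <pages> data pages of <page_size> bytes each (a quarter of the buffer).
max_mmap_flush_bytes() {
    local pages=$1 page_size=$2
    echo $(( pages * page_size / 4 ))
}

# Assumed example: 128 data pages of 4096 bytes make a 524288-byte buffer.
max_mmap_flush_bytes 128 4096   # prints 131072, i.e. 128K
```

Under these assumptions, --mmap-flush=128K would be the largest accepted value.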

-z::
--compression-level[=n]::
Produce compressed trace using specified level n (default: 1 - fastest compression,
22 - smallest trace).

--all-kernel::
Configure all used events to run in kernel space.

--all-user::
Configure all used events to run in user space.

--kernel-callchains::
Collect callchains only from kernel space. I.e. this option sets
perf_event_attr.exclude_callchain_user to 1.

--user-callchains::
Collect callchains only from user space. I.e. this option sets
perf_event_attr.exclude_callchain_kernel to 1.

Don't use both --kernel-callchains and --user-callchains at the same time or no
callchains will be collected.

--timestamp-filename::
Append timestamp to output file name.

--timestamp-boundary::
Record timestamp boundary (time of first/last samples).

--switch-output[=mode]::
Generate multiple perf.data files, timestamp prefixed, switching to a new one
based on 'mode' value:

  - "signal" - when receiving a SIGUSR2 (default value)
  - <size>   - when reaching the size threshold, size is expected to
               be a number with appended unit character - B/K/M/G
  - <time>   - when reaching the time threshold, time is expected to
               be a number with appended unit character - s/m/h/d

               Note: the precision of the size threshold hugely depends
               on your configuration - the number and size of your ring
               buffers (-m). It is generally more precise for higher sizes
               (like >5M), for lower values expect different sizes.

A possible use case is to slice the perf.data file when an external event
occurs; the resulting file can then be processed, possibly via a perf script,
to decide whether that particular perf.data snapshot should be kept.

Implies --timestamp-filename, --no-buildid and --no-buildid-cache.
The reason for the latter two is to reduce the data file switching
overhead. You can still switch them on with:

  --switch-output --no-no-buildid --no-no-buildid-cache
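
The suffix rules for 'mode' can be summarised in a few lines of shell. This is
a sketch for illustration only, not how perf parses the option; the helper name
switch_output_mode is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: report which --switch-output mode a value selects,
# following the suffix rules listed for this option.
switch_output_mode() {
    case $1 in
        ''|signal)      echo signal ;;   # no value means the default, signal
        *[0-9][BKMG])   echo size ;;
        *[0-9][smhd])   echo time ;;
        *)              echo invalid ;;
    esac
}

switch_output_mode signal   # prints: signal
switch_output_mode 100M     # prints: size
switch_output_mode 30s      # prints: time
```

Under this reading the unit character is case sensitive: 100M is a size
threshold while 100m is a time threshold.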

--switch-output-event::
Events that will cause the switch of the perf.data file, auto-selecting
--switch-output=signal. The results are similar, as internally the sideband
thread will also send a SIGUSR2 to the main one.

Uses the same syntax as --event; the event is simply not recorded and serves
only to switch the perf.data file as soon as the --switch-output event is
processed by a separate sideband thread.

This sideband thread is also used for other purposes, like processing the
PERF_RECORD_BPF_EVENT records as they happen, asking the kernel for extra BPF
information, etc.

--switch-max-files=N::

When rotating perf.data with --switch-output, only keep N files.

--dry-run::
Parse options then exit. --dry-run can be used to detect errors in cmdline
options.

'perf record --dry-run -e' can act as a BPF script compiler if llvm.dump-obj
in config file is set to true.

--synth=TYPE::
Collect and synthesize given type of events (comma separated).  Note that
this option controls the synthesis from the /proc filesystem, which represents
task status for pre-existing threads.

Kernel (and some other) events are recorded regardless of the
choice in this option.  For example, --synth=no would have MMAP events for
kernel and modules.

Available types are:

  - 'task'    - synthesize FORK and COMM events for each task
  - 'mmap'    - synthesize MMAP events for each process (implies 'task')
  - 'cgroup'  - synthesize CGROUP events for each cgroup
  - 'all'     - synthesize all events (default)
  - 'no'      - do not synthesize any of the above events

--tail-synthesize::
Instead of collecting non-sample events (for example, fork, comm, mmap) at
the beginning of record, collect them while finalizing the output file.
The collected non-sample events reflect the status of the system when
record is finished.

--overwrite::
Makes all events use an overwritable ring buffer. An overwritable ring
buffer works like a flight recorder: when it gets full, the kernel will
overwrite the oldest records, which thus will never make it to the
perf.data file.

When '--overwrite' and '--switch-output' are used, perf records and drops
events until it receives a signal, meaning that something unusual was
detected that warrants taking a snapshot of the most current events,
those fitting in the ring buffer at that moment.

The 'overwrite' attribute can also be set or canceled for an event using
config terms. For example: 'cycles/overwrite/' and 'instructions/no-overwrite/'.

Implies --tail-synthesize.

--kcore::
Make a copy of /proc/kcore and place it into a directory with the perf data file.

--max-size=<size>::
Limit the sample data max size, <size> is expected to be a number with
appended unit character - B/K/M/G.

--num-thread-synthesize::
	The number of threads to run when synthesizing events for existing processes.
	By default, the number of threads equals 1.

ifdef::HAVE_LIBPFM[]
--pfm-events events::
Select a PMU event using libpfm4 syntax (see http://perfmon2.sf.net)
including support for event filters. For example '--pfm-events
inst_retired:any_p:u:c=1:i'. More than one event can be passed to the
option using the comma separator. Hardware events and generic hardware
events cannot be mixed together. The latter must be used with the -e
option. The -e option and this one can be mixed and matched.  Events
can be grouped using the {} notation.
endif::HAVE_LIBPFM[]

--control=fifo:ctl-fifo[,ack-fifo]::
--control=fd:ctl-fd[,ack-fd]::
ctl-fifo / ack-fifo are opened and used as ctl-fd / ack-fd as follows.
Listen on ctl-fd descriptor for command to control measurement.

Available commands:

  - 'enable'           : enable events
  - 'disable'          : disable events
  - 'enable name'      : enable event 'name'
  - 'disable name'     : disable event 'name'
  - 'snapshot'         : AUX area tracing snapshot
  - 'stop'             : stop perf record
  - 'ping'             : ping
  - 'evlist [-v|-g|-F]' : display all events

                         -F  Show just the sample frequency used for each event.
                         -v  Show all fields.
                         -g  Show event group information.

Measurements can be started with events disabled using the --delay=-1 option. Optionally,
perf sends a control command completion message ('ack\n') to the ack-fd descriptor to
synchronize with the controlling process.  Example of bash shell script to enable and
disable events during measurements:

 #!/bin/bash

 ctl_dir=/tmp/

 ctl_fifo=${ctl_dir}perf_ctl.fifo
 test -p ${ctl_fifo} && unlink ${ctl_fifo}
 mkfifo ${ctl_fifo}
 exec {ctl_fd}<>${ctl_fifo}

 ctl_ack_fifo=${ctl_dir}perf_ctl_ack.fifo
 test -p ${ctl_ack_fifo} && unlink ${ctl_ack_fifo}
 mkfifo ${ctl_ack_fifo}
 exec {ctl_fd_ack}<>${ctl_ack_fifo}

 perf record -D -1 -e cpu-cycles -a               \
             --control fd:${ctl_fd},${ctl_fd_ack} \
             -- sleep 30 &
 perf_pid=$!

 sleep 5  && echo 'enable' >&${ctl_fd} && read -u ${ctl_fd_ack} e1 && echo "enabled(${e1})"
 sleep 10 && echo 'disable' >&${ctl_fd} && read -u ${ctl_fd_ack} d1 && echo "disabled(${d1})"

 exec {ctl_fd_ack}>&-
 unlink ${ctl_ack_fifo}

 exec {ctl_fd}>&-
 unlink ${ctl_fifo}

 wait -n ${perf_pid}
 exit $?
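
The command/acknowledgement exchange can also be tried without perf. The sketch
below is an illustration only: a background shell function (the hypothetical
stand_in_perf) plays the role of 'perf record' started with the fifo: form of
--control, answering every command with 'ack'. The FIFO file names are likewise
made up for the example.

```shell
#!/bin/bash
# Sketch of the ctl/ack handshake over named FIFOs. A background shell
# function stands in for perf so the exchange can be tried anywhere.
ctl_dir=$(mktemp -d)
ctl_fifo=$ctl_dir/perf_ctl.fifo
ack_fifo=$ctl_dir/perf_ctl_ack.fifo
mkfifo "$ctl_fifo" "$ack_fifo"

# Stand-in for:
#   perf record -D -1 --control=fifo:$ctl_fifo,$ack_fifo ...
# It acknowledges every command and leaves on 'stop'.
stand_in_perf() {
    while read -r cmd; do
        printf 'ack\n'
        if [ "$cmd" = stop ]; then break; fi
    done <"$ctl_fifo" >"$ack_fifo"
}
stand_in_perf &

# Controlling side: open both FIFOs once, then send commands and wait
# for the acknowledgement of each before sending the next.
exec 3>"$ctl_fifo" 4<"$ack_fifo"
acks=0
for cmd in enable disable stop; do
    printf '%s\n' "$cmd" >&3
    read -r reply <&4
    if [ "$reply" = ack ]; then acks=$((acks + 1)); fi
done
exec 3>&- 4<&-

wait
rm -r "$ctl_dir"
echo "acknowledged commands: $acks"
```

Waiting for each acknowledgement before sending the next command keeps the
controlling script in step with the recorder, which is the purpose of ack-fd.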

--threads=<spec>::
Write collected trace data into several data files using parallel threads.
<spec> value can be user defined list of masks. Masks separated by colon
define CPUs to be monitored by a thread and affinity mask of that thread
is separated by slash:

    <cpus mask 1>/<affinity mask 1>:<cpus mask 2>/<affinity mask 2>:...

CPUs or affinity masks must not overlap with other corresponding masks.
Invalid CPUs are ignored, but masks containing only invalid CPUs are not
allowed.

For example user specification like the following:

    0,2-4/2-4:1,5-7/5-7

specifies parallel threads layout that consists of two threads,
the first thread monitors CPUs 0 and 2-4 with the affinity mask 2-4,
the second monitors CPUs 1 and 5-7 with the affinity mask 5-7.

<spec> value can also be a string meaning predefined parallel threads
layout:

    - cpu     - create new data streaming thread for every monitored cpu
    - core    - create new thread to monitor CPUs grouped by a core
    - package - create new thread to monitor CPUs grouped by a package
    - numa    - create new thread to monitor CPUs grouped by a NUMA domain

Predefined layouts can be used on systems with large number of CPUs in
order not to spawn multiple per-cpu streaming threads but still avoid LOST
events in data directory files. Option specified with no or empty value
defaults to the 'cpu' layout. Masks defined or provided by the option value are
filtered through the mask provided by -C option.
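
To make the user defined mask syntax concrete, the following sketch splits a
<spec> string into its per-thread parts. It is not part of perf and does no
validation; the helper name describe_threads_spec is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: split a user-defined --threads spec into one line
# per streaming thread, showing its monitored CPUs and affinity mask.
describe_threads_spec() {
    local n=0 entry
    local IFS=':'
    for entry in $1; do
        n=$((n + 1))
        echo "thread $n: cpus=${entry%%/*} affinity=${entry#*/}"
    done
}

describe_threads_spec '0,2-4/2-4:1,5-7/5-7'
# thread 1: cpus=0,2-4 affinity=2-4
# thread 2: cpus=1,5-7 affinity=5-7
```

The colon separates threads and the slash separates the monitored CPUs from the
affinity mask, matching the two-thread layout described for this option.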

--debuginfod[=URLs]::
	Specify debuginfod URL to be used when caching perf.data binaries,
	it follows the same syntax as the DEBUGINFOD_URLS variable, like:

	  http://192.168.122.174:8002

	If the URLs are not specified, the value of the DEBUGINFOD_URLS
	system environment variable is used.

--off-cpu::
	Enable off-cpu profiling with BPF.  The BPF program will collect
	task scheduling information with (user) stacktrace and save them
	as sample data of a software event named "offcpu-time".  The
	sample period will have the time the task slept in nanoseconds.

	Note that BPF can collect stack traces using frame pointer ("fp")
	only, as of now.  So the applications built without the frame
	pointer might see bogus addresses.

include::intel-hybrid.txt[]

SEE ALSO
--------
linkperf:perf-stat[1], linkperf:perf-list[1], linkperf:perf-intel-pt[1]