perf-bench(1)
=============

NAME
----
perf-bench - General framework for benchmark suites

SYNOPSIS
--------
[verse]
'perf bench' [<common options>] <subsystem> <suite> [<options>]

DESCRIPTION
-----------
The 'perf bench' command is a general framework for running benchmark suites.

COMMON OPTIONS
--------------
-r::
--repeat=::
Specify number of times to repeat the run (default 10).

-f::
--format=::
Specify format style.
Currently available format styles are:

'default'::
Default style. This is mainly for human reading.
---------------------
% perf bench sched pipe                      # with no style specified
(executing 1000000 pipe operations between two tasks)
        Total time:5.855 sec
                5.855061 usecs/op
                170792 ops/sec
---------------------

'simple'::
This simple style is friendly for automated
processing by scripts.
---------------------
% perf bench --format=simple sched pipe      # specified simple
5.988
---------------------
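
Because the 'simple' style prints a bare number, a script can consume it with very little parsing. The Python sketch below assumes, based only on the example above, that the output is a single floating-point value (taken here to be the total time in seconds). The helper names 'parse_simple_output' and 'run_simple' are hypothetical and are not part of perf.

```python
import subprocess


def parse_simple_output(text):
    """Return the first line of `text` that parses as a float.

    Assumption: '--format=simple' prints one bare number, as in the
    example above; the unit is taken to be seconds.
    """
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            return float(line)
        except ValueError:
            continue  # skip any non-numeric line
    raise ValueError("no numeric value found in output")


def run_simple(subsystem, suite):
    """Run 'perf bench --format=simple <subsystem> <suite>' and parse it.

    Requires perf to be installed and on PATH.
    """
    result = subprocess.run(
        ["perf", "bench", "--format=simple", subsystem, suite],
        capture_output=True, text=True, check=True,
    )
    return parse_simple_output(result.stdout)


if __name__ == "__main__":
    # Parse the sample output shown above without invoking perf.
    print(parse_simple_output("5.988\n"))
```

On a machine with perf installed, 'run_simple("sched", "pipe")' would run the same command as the example and return the parsed value.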

SUBSYSTEM
---------

'sched'::
	Scheduler and IPC mechanisms.

'syscall'::
	System call performance (throughput).

'mem'::
	Memory access performance.

'numa'::
	NUMA scheduling and MM benchmarks.

'futex'::
	Futex stressing benchmarks.

'epoll'::
	Eventpoll (epoll) stressing benchmarks.

'internals'::
	Benchmark internal perf functionality.

'uprobe'::
	Benchmark overhead of uprobe + BPF.

'all'::
	All benchmark subsystems.

SUITES FOR 'sched'
~~~~~~~~~~~~~~~~~~
*messaging*::
Suite for evaluating performance of scheduler and IPC mechanisms.
Based on hackbench by Rusty Russell.

Options of *messaging*
^^^^^^^^^^^^^^^^^^^^^^
-p::
--pipe::
Use pipe() instead of socketpair().

-t::
--thread::
Use multiple threads instead of multiple processes.

-g::
--group=::
Specify number of groups.

-l::
--nr_loops=::
Specify number of loops.

Example of *messaging*
^^^^^^^^^^^^^^^^^^^^^^

---------------------
% perf bench sched messaging                 # run with default options
(20 sender and receiver processes per group)
(10 groups == 400 processes run)

      Total time:0.308 sec

% perf bench sched messaging -t -g 20        # be multi-thread, with 20 groups
(20 sender and receiver threads per group)
(20 groups == 800 threads run)

      Total time:0.582 sec
---------------------
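
The task counts printed in the example follow from simple arithmetic: each group holds 20 senders and 20 receivers, so 10 groups give 400 tasks and 20 groups give 800. The Python sketch below reproduces that calculation. The function name and the 'per_side' parameter are hypothetical, and the value 20 is read off the example output rather than taken from the perf source.

```python
def messaging_task_count(groups=10, per_side=20):
    """Total tasks spawned: groups * (senders + receivers).

    Assumption: every group has `per_side` senders and the same number
    of receivers, as the example output suggests.
    """
    if groups < 1 or per_side < 1:
        raise ValueError("groups and per_side must be positive")
    return groups * 2 * per_side


if __name__ == "__main__":
    print(messaging_task_count())           # default run in the example
    print(messaging_task_count(groups=20))  # the '-g 20' run
```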

*pipe*::
Suite for pipe() system call.
Based on pipe-test-1m.c by Ingo Molnar.

Options of *pipe*
^^^^^^^^^^^^^^^^^
-l::
--loop=::
Specify number of loops.

Example of *pipe*
^^^^^^^^^^^^^^^^^

---------------------
% perf bench sched pipe
(executing 1000000 pipe operations between two tasks)

        Total time:8.091 sec
                8.091833 usecs/op
                123581 ops/sec

% perf bench sched pipe -l 1000              # loop 1000
(executing 1000 pipe operations between two tasks)

        Total time:0.016 sec
                16.948000 usecs/op
                59004 ops/sec
---------------------
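
The two derived figures in the example can be recomputed from the total time and the loop count. The Python sketch below shows the arithmetic. It assumes that ops/sec is truncated to an integer, which matches the sample numbers, and it uses the unrounded totals implied by the usecs/op lines (8.091833 s and 0.016948 s). The function name is hypothetical.

```python
def pipe_metrics(total_sec, loops):
    """Return (usecs_per_op, ops_per_sec) for a run.

    usecs/op is the total time in microseconds divided by the loop
    count; ops/sec is the loop count divided by the total time,
    truncated to an integer (an assumption that fits the examples).
    """
    if total_sec <= 0 or loops <= 0:
        raise ValueError("total_sec and loops must be positive")
    usecs_per_op = total_sec * 1_000_000 / loops
    ops_per_sec = int(loops / total_sec)
    return usecs_per_op, ops_per_sec


if __name__ == "__main__":
    for total, loops in ((8.091833, 1_000_000), (0.016948, 1000)):
        usecs, ops = pipe_metrics(total, loops)
        print(f"{usecs:.6f} usecs/op, {ops} ops/sec")
```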

SUITES FOR 'syscall'
~~~~~~~~~~~~~~~~~~~~
*basic*::
Suite for evaluating core system call throughput, reported as both usecs/op and ops/sec.
It uses a single thread that repeatedly calls getppid(2), a simple syscall whose result is not
cached by glibc.
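
The shape of such a measurement can be sketched in Python: time a tight loop of getppid calls in one thread and derive both metrics from the elapsed time. This is only an illustration of the technique, not perf's implementation. Interpreter overhead dominates the figures it reports, and the function name and default loop count are hypothetical.

```python
import os
import time


def bench_getppid(loops=100_000):
    """Time `loops` calls to os.getppid() in a single thread.

    Returns the loop count, elapsed seconds, usecs/op and ops/sec.
    The numbers include Python call overhead, so they are not
    comparable with what 'perf bench syscall basic' reports.
    """
    if loops < 1:
        raise ValueError("loops must be positive")
    start = time.perf_counter()
    for _ in range(loops):
        os.getppid()
    elapsed = time.perf_counter() - start
    return {
        "loops": loops,
        "total_sec": elapsed,
        "usecs_per_op": elapsed * 1_000_000 / loops,
        "ops_per_sec": loops / elapsed if elapsed > 0 else float("inf"),
    }


if __name__ == "__main__":
    result = bench_getppid()
    print(f"{result['usecs_per_op']:.6f} usecs/op")
    print(f"{int(result['ops_per_sec'])} ops/sec")
```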

SUITES FOR 'mem'
~~~~~~~~~~~~~~~~
*memcpy*::
Suite for evaluating performance of simple memory copy in various ways.

Options of *memcpy*
^^^^^^^^^^^^^^^^^^^
-s::
--size::
Specify size of memory to copy (default: 1MB).
Available units are B, KB, MB, GB and TB (case insensitive).

-f::
--function::
Specify function to copy (default: default).
Available functions depend on the architecture.
On x86-64, x86-64-unrolled, x86-64-movsq and x86-64-movsb are supported.

-l::
--nr_loops::
Repeat memcpy invocation this number of times.

-c::
--cycles::
Use perf's cpu-cycles event instead of gettimeofday syscall.
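
To illustrate how the size and loop options combine, the Python sketch below parses a size string with the units listed above and times repeated buffer copies with a wall clock. It assumes binary multiples (1 KB = 1024 bytes) and requires a unit suffix. Both are assumptions of this sketch, and the helper names are hypothetical. It mirrors the idea of a wall-clock measurement only and does not use perf or its copy routines.

```python
import re
import time

# Assumption: units are binary multiples (1 KB = 1024 bytes).
_UNITS = {"B": 1, "KB": 1 << 10, "MB": 1 << 20, "GB": 1 << 30, "TB": 1 << 40}


def parse_size(text):
    """Convert a string such as '1MB' or '4kb' to a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", text)
    if not match or match.group(2).upper() not in _UNITS:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2).upper()]


def timed_copy(size_bytes, nr_loops=1):
    """Copy a buffer `nr_loops` times and return bytes per second."""
    src = bytearray(size_bytes)
    dst = bytearray(size_bytes)
    start = time.perf_counter()
    for _ in range(nr_loops):
        dst[:] = src  # full-slice assignment copies the whole buffer
    elapsed = time.perf_counter() - start
    total = size_bytes * nr_loops
    return total / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    size = parse_size("1MB")
    rate = timed_copy(size, nr_loops=10)
    print(f"{size} bytes x 10 loops: {rate / (1 << 20):.1f} MB/sec")
```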

*memset*::
Suite for evaluating performance of simple memory set in various ways.

Options of *memset*
^^^^^^^^^^^^^^^^^^^
-s::
--size::
Specify size of memory to set (default: 1MB).
Available units are B, KB, MB, GB and TB (case insensitive).

-f::
--function::
Specify function to set (default: default).
Available functions depend on the architecture.
On x86-64, x86-64-unrolled, x86-64-stosq and x86-64-stosb are supported.

-l::
--nr_loops::
Repeat memset invocation this number of times.

-c::
--cycles::
Use perf's cpu-cycles event instead of gettimeofday syscall.

SUITES FOR 'numa'
~~~~~~~~~~~~~~~~~
*mem*::
Suite for evaluating NUMA workloads.

SUITES FOR 'futex'
~~~~~~~~~~~~~~~~~~
*hash*::
Suite for evaluating hash tables.

*wake*::
Suite for evaluating wake calls.

*wake-parallel*::
Suite for evaluating parallel wake calls.

*requeue*::
Suite for evaluating requeue calls.

*lock-pi*::
Suite for evaluating futex lock_pi calls.

SUITES FOR 'epoll'
~~~~~~~~~~~~~~~~~~
*wait*::
Suite for evaluating concurrent epoll_wait calls.

*ctl*::
Suite for evaluating multiple epoll_ctl calls.

SUITES FOR 'internals'
~~~~~~~~~~~~~~~~~~~~~~
*synthesize*::
Suite for evaluating perf's event synthesis performance.

SEE ALSO
--------
linkperf:perf[1]