1
2		------------------------------
3		****** perf by examples ******
4		------------------------------
5
6[ From an e-mail by Ingo Molnar, http://lkml.org/lkml/2009/8/4/346 ]
7
8
9First, discovery/enumeration of available counters can be done via
10'perf list':
11
12titan:~> perf list
13  [...]
14  kmem:kmalloc                             [Tracepoint event]
15  kmem:kmem_cache_alloc                    [Tracepoint event]
16  kmem:kmalloc_node                        [Tracepoint event]
17  kmem:kmem_cache_alloc_node               [Tracepoint event]
18  kmem:kfree                               [Tracepoint event]
19  kmem:kmem_cache_free                     [Tracepoint event]
20  kmem:mm_page_free                        [Tracepoint event]
21  kmem:mm_page_free_batched                [Tracepoint event]
22  kmem:mm_page_alloc                       [Tracepoint event]
23  kmem:mm_page_alloc_zone_locked           [Tracepoint event]
24  kmem:mm_page_pcpu_drain                  [Tracepoint event]
25  kmem:mm_page_alloc_extfrag               [Tracepoint event]
26
27Then any (or all) of the above event sources can be activated and
28measured. For example the page alloc/free properties of a 'hackbench
29run' are:
30
31 titan:~> perf stat -e kmem:mm_page_pcpu_drain -e kmem:mm_page_alloc
32 -e kmem:mm_page_free_batched -e kmem:mm_page_free ./hackbench 10
33 Time: 0.575
34
35 Performance counter stats for './hackbench 10':
36
37          13857  kmem:mm_page_pcpu_drain
38          27576  kmem:mm_page_alloc
39           6025  kmem:mm_page_free_batched
40          20934  kmem:mm_page_free
41
42    0.613972165  seconds time elapsed
43
44You can observe the statistical properties as well, by using the
45'repeat the workload N times' feature of perf stat:
46
47 titan:~> perf stat --repeat 5 -e kmem:mm_page_pcpu_drain -e
48   kmem:mm_page_alloc -e kmem:mm_page_free_batched -e
49   kmem:mm_page_free ./hackbench 10
50 Time: 0.627
51 Time: 0.644
52 Time: 0.564
53 Time: 0.559
54 Time: 0.626
55
56 Performance counter stats for './hackbench 10' (5 runs):
57
58          12920  kmem:mm_page_pcpu_drain    ( +-   3.359% )
59          25035  kmem:mm_page_alloc         ( +-   3.783% )
60           6104  kmem:mm_page_free_batched  ( +-   0.934% )
61          18376  kmem:mm_page_free	    ( +-   4.941% )
62
63    0.643954516  seconds time elapsed   ( +-   2.363% )
64
65Furthermore, these tracepoints can be used to sample the workload as
66well. For example the page allocations done by a 'git gc' can be
67captured the following way:
68
69 titan:~/git> perf record -e kmem:mm_page_alloc -c 1 ./git gc
70 Counting objects: 1148, done.
71 Delta compression using up to 2 threads.
72 Compressing objects: 100% (450/450), done.
73 Writing objects: 100% (1148/1148), done.
74 Total 1148 (delta 690), reused 1148 (delta 690)
75 [ perf record: Captured and wrote 0.267 MB perf.data (~11679 samples) ]
76
77To check which functions generated page allocations:
78
79 titan:~/git> perf report
80 # Samples: 10646
81 #
82 # Overhead          Command               Shared Object
83 # ........  ...............  ..........................
84 #
85    23.57%       git-repack  /lib64/libc-2.5.so
86    21.81%              git  /lib64/libc-2.5.so
87    14.59%              git  ./git
88    11.79%       git-repack  ./git
89     7.12%              git  /lib64/ld-2.5.so
90     3.16%       git-repack  /lib64/libpthread-2.5.so
91     2.09%       git-repack  /bin/bash
92     1.97%               rm  /lib64/libc-2.5.so
93     1.39%               mv  /lib64/ld-2.5.so
94     1.37%               mv  /lib64/libc-2.5.so
95     1.12%       git-repack  /lib64/ld-2.5.so
96     0.95%               rm  /lib64/ld-2.5.so
97     0.90%  git-update-serv  /lib64/libc-2.5.so
98     0.73%  git-update-serv  /lib64/ld-2.5.so
99     0.68%             perf  /lib64/libpthread-2.5.so
100     0.64%       git-repack  /usr/lib64/libz.so.1.2.3
101
102Or to see it on a more finegrained level:
103
104titan:~/git> perf report --sort comm,dso,symbol
105# Samples: 10646
106#
107# Overhead          Command               Shared Object  Symbol
108# ........  ...............  ..........................  ......
109#
110     9.35%       git-repack  ./git                       [.] insert_obj_hash
111     9.12%              git  ./git                       [.] insert_obj_hash
112     7.31%              git  /lib64/libc-2.5.so          [.] memcpy
113     6.34%       git-repack  /lib64/libc-2.5.so          [.] _int_malloc
114     6.24%       git-repack  /lib64/libc-2.5.so          [.] memcpy
115     5.82%       git-repack  /lib64/libc-2.5.so          [.] __GI___fork
116     5.47%              git  /lib64/libc-2.5.so          [.] _int_malloc
117     2.99%              git  /lib64/libc-2.5.so          [.] memset
118
119Furthermore, call-graph sampling can be done too, of page
120allocations - to see precisely what kind of page allocations there
121are:
122
123 titan:~/git> perf record -g -e kmem:mm_page_alloc -c 1 ./git gc
124 Counting objects: 1148, done.
125 Delta compression using up to 2 threads.
126 Compressing objects: 100% (450/450), done.
127 Writing objects: 100% (1148/1148), done.
128 Total 1148 (delta 690), reused 1148 (delta 690)
129 [ perf record: Captured and wrote 0.963 MB perf.data (~42069 samples) ]
130
131 titan:~/git> perf report -g
132 # Samples: 10686
133 #
134 # Overhead          Command               Shared Object
135 # ........  ...............  ..........................
136 #
137    23.25%       git-repack  /lib64/libc-2.5.so
138                |
139                |--50.00%-- _int_free
140                |
141                |--37.50%-- __GI___fork
142                |          make_child
143                |
144                |--12.50%-- ptmalloc_unlock_all2
145                |          make_child
146                |
147                 --6.25%-- __GI_strcpy
148    21.61%              git  /lib64/libc-2.5.so
149                |
150                |--30.00%-- __GI_read
151                |          |
152                |           --83.33%-- git_config_from_file
153                |                     git_config
154                |                     |
155   [...]
156
157Or you can observe the whole system's page allocations for 10
158seconds:
159
160titan:~/git> perf stat -a -e kmem:mm_page_pcpu_drain -e
161kmem:mm_page_alloc -e kmem:mm_page_free_batched -e
162kmem:mm_page_free sleep 10
163
164 Performance counter stats for 'sleep 10':
165
166         171585  kmem:mm_page_pcpu_drain
167         322114  kmem:mm_page_alloc
168          73623  kmem:mm_page_free_batched
169         254115  kmem:mm_page_free
170
171   10.000591410  seconds time elapsed
172
173Or observe how fluctuating the page allocations are, via statistical
174analysis done over ten 1-second intervals:
175
176 titan:~/git> perf stat --repeat 10 -a -e kmem:mm_page_pcpu_drain -e
177   kmem:mm_page_alloc -e kmem:mm_page_free_batched -e
178   kmem:mm_page_free sleep 1
179
180 Performance counter stats for 'sleep 1' (10 runs):
181
182          17254  kmem:mm_page_pcpu_drain    ( +-   3.709% )
183          34394  kmem:mm_page_alloc         ( +-   4.617% )
184           7509  kmem:mm_page_free_batched  ( +-   4.820% )
185          25653  kmem:mm_page_free	    ( +-   3.672% )
186
187    1.058135029  seconds time elapsed   ( +-   3.089% )
188
189Or you can annotate the recorded 'git gc' run on a per symbol basis
190and check which instructions/source-code generated page allocations:
191
192 titan:~/git> perf annotate __GI___fork
193 ------------------------------------------------
194  Percent |      Source code & Disassembly of libc-2.5.so
195 ------------------------------------------------
196          :
197          :
198          :      Disassembly of section .plt:
199          :      Disassembly of section .text:
200          :
201          :      00000031a2e95560 <__fork>:
202 [...]
203     0.00 :        31a2e95602:   b8 38 00 00 00          mov    $0x38,%eax
204     0.00 :        31a2e95607:   0f 05                   syscall
205    83.42 :        31a2e95609:   48 3d 00 f0 ff ff       cmp    $0xfffffffffffff000,%rax
206     0.00 :        31a2e9560f:   0f 87 4d 01 00 00       ja     31a2e95762 <__fork+0x202>
207     0.00 :        31a2e95615:   85 c0                   test   %eax,%eax
208
209( this shows that 83.42% of __GI___fork's page allocations come from
210  the 0x38 system call it performs. )
211
212etc. etc. - a lot more is possible. I could list a dozen of
213other different usecases straight away - neither of which is
214possible via /proc/vmstat.
215
216/proc/vmstat is not in the same league really, in terms of
217expressive power of system analysis and performance
218analysis.
219
220All that the above results needed were those new tracepoints
221in include/tracing/events/kmem.h.
222
223	Ingo
224
225
226