================
Delay accounting
================

Tasks encounter delays in execution when they wait
for some kernel resource to become available, e.g. a
runnable task may wait for a free CPU to run on.

The per-task delay accounting functionality measures
the delays experienced by a task while

a) waiting for a CPU (while being runnable)
b) waiting for completion of synchronous block I/O initiated by the task
c) swapping in pages
d) reclaiming memory
e) thrashing the page cache
f) performing direct compaction
g) performing a write-protect copy

and makes these statistics available to userspace through
the taskstats interface.

Such delays provide feedback for setting a task's CPU priority,
I/O priority and RSS limit values appropriately. Long delays for
important tasks could be a trigger for raising their corresponding priority.

The functionality, through its use of the taskstats interface, also provides
delay statistics aggregated for all tasks (or threads) belonging to a
thread group (corresponding to a traditional Unix process). This is a commonly
needed aggregation that is more efficiently done by the kernel.

Userspace utilities, particularly resource management applications, can also
aggregate delay statistics into arbitrary groups. To enable this, delay
statistics of a task are available both during its lifetime and on its
exit, so that monitoring can be continuous and complete.
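As an illustration of grouping in userspace, the following minimal sketch sums
per-task delay counters by an arbitrary group label. The counter names mirror
fields of the taskstats structure, but the sample values, the group labels and
the ``aggregate_by_group`` helper are hypothetical; obtaining the counters from
the kernel is not shown.

```python
from collections import defaultdict

# Hypothetical per-task samples: (pid, group label, cumulative counters).
# The values and labels are invented for illustration only.
SAMPLES = [
    (101, "batch", {"cpu_delay_total": 3_000_000, "blkio_delay_total": 500_000}),
    (102, "batch", {"cpu_delay_total": 1_000_000, "blkio_delay_total": 0}),
    (201, "interactive", {"cpu_delay_total": 250_000, "blkio_delay_total": 750_000}),
]


def aggregate_by_group(samples):
    """Sum each delay counter over all tasks that share a group label."""
    totals = defaultdict(lambda: defaultdict(int))
    for _pid, group, counters in samples:
        for field, value in counters.items():
            totals[group][field] += value
    # Convert the nested defaultdicts into plain dicts for the caller.
    return {group: dict(fields) for group, fields in totals.items()}


GROUP_TOTALS = aggregate_by_group(SAMPLES)
```

A monitor built along these lines would also need to fold in the statistics
delivered when a task exits, otherwise the delays of short-lived tasks would be
missing from the group totals.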


Interface
---------

Delay accounting uses the taskstats interface, which is described
in detail in a separate document in this directory. Taskstats returns a
generic data structure to userspace corresponding to per-pid and per-tgid
statistics. The delay accounting functionality populates specific fields of
this structure. See

     include/uapi/linux/taskstats.h

for a description of the fields pertaining to delay accounting.
These are generally counters returning the cumulative
delay seen for CPU, sync block I/O, swapin, memory reclaim, page cache
thrashing, direct compaction, write-protect copy, etc.

Taking the difference of two successive readings of a given
counter (say cpu_delay_total) for a task will give the delay
experienced by the task waiting for the corresponding resource
in that interval.
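The subtraction can be sketched as follows. This is a minimal illustration,
not part of any kernel tool: it assumes the counter is an unsigned 64-bit
value that may wrap around to zero, and the two readings are invented numbers.
The ``interval_delay_ns`` helper name is hypothetical.

```python
# Assumption: delay counters are unsigned 64-bit values that can wrap to zero.
U64_MODULUS = 1 << 64


def interval_delay_ns(earlier, later):
    """Delay accumulated between two readings of one cumulative counter."""
    # Modular subtraction stays correct across a single wraparound.
    return (later - earlier) % U64_MODULUS


# Two invented successive readings of cpu_delay_total for the same task.
first_reading = {"cpu_delay_total": 3_382_277}
second_reading = {"cpu_delay_total": 5_000_000}

CPU_DELAY_NS = interval_delay_ns(
    first_reading["cpu_delay_total"], second_reading["cpu_delay_total"]
)
```

The same helper applies unchanged to any of the other cumulative delay
counters, as long as both readings belong to the same task.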

When a task exits, records containing the per-task statistics
are sent to userspace without requiring a command. If it is the last exiting
task of a thread group, the per-tgid statistics are also sent. More details
are given in the taskstats interface description.

The getdelays.c userspace utility in the tools/accounting directory allows simple
commands to be run and the corresponding delay statistics to be displayed. It
also serves as an example of using the taskstats interface.

Usage
-----

Compile the kernel with::

	CONFIG_TASK_DELAY_ACCT=y
	CONFIG_TASKSTATS=y

Delay accounting is disabled by default at boot up.
To enable it, add::

   delayacct

to the kernel boot options. The rest of the instructions below assume this has
been done. Alternatively, use the kernel.task_delayacct sysctl to switch the state
at runtime. Note, however, that only tasks started after enabling it will have
delayacct information.
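A monitoring script may want to check the current state before collecting
data. The sketch below assumes that the kernel.task_delayacct sysctl is exposed
at the usual procfs location, ``/proc/sys/kernel/task_delayacct``, and that a
non-zero value means enabled; the ``delayacct_enabled`` helper is hypothetical.

```python
from pathlib import Path

# Assumption: the kernel.task_delayacct sysctl appears at this procfs path.
SYSCTL_PATH = Path("/proc/sys/kernel/task_delayacct")


def delayacct_enabled(path=SYSCTL_PATH):
    """Return True or False for the sysctl value, or None if it is unreadable."""
    try:
        return int(Path(path).read_text().strip()) != 0
    except (OSError, ValueError):
        # File missing, not readable, or not a number: state unknown.
        return None


if __name__ == "__main__":
    print("delay accounting enabled:", delayacct_enabled())
```

A result of ``None`` only means the state could not be determined, for example
on a kernel that does not provide the sysctl.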

After the system has booted up, use a utility
similar to getdelays.c to access the delays
seen by a given task or a task group (tgid).
The utility also allows a given command to be
executed and the corresponding delays to be
seen.

General format of the getdelays command::

	getdelays [-dilv] [-t tgid] [-p pid]

Get delays, since system boot, for pid 10::

	# ./getdelays -d -p 10
	(output similar to next case)

Get sum of delays, since system boot, for all pids with tgid 5::

	# ./getdelays -d -t 5
	print delayacct stats ON
	TGID	5


	CPU             count     real total  virtual total    delay total  delay average
	                    8        7000000        6872122        3382277          0.423ms
	IO              count    delay total  delay average
	                    0              0              0ms
	SWAP            count    delay total  delay average
	                    0              0              0ms
	RECLAIM         count    delay total  delay average
	                    0              0              0ms
	THRASHING       count    delay total  delay average
	                    0              0              0ms
	COMPACT         count    delay total  delay average
	                    0              0              0ms
	WPCOPY          count    delay total  delay average
	                    0              0              0ms
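The CPU row above is consistent with the delay total being expressed in
nanoseconds and the average being the total divided by the count. The sketch
below reproduces that arithmetic under those assumptions; the ``average_ms``
helper here is illustrative and is not taken from getdelays.c.

```python
def average_ms(delay_total_ns, count):
    """Average delay per event in milliseconds; a zero count is treated as one."""
    return delay_total_ns / 1_000_000 / (count or 1)


# Values taken from the CPU row of the sample output: delay total and count.
CPU_AVERAGE = f"{average_ms(3_382_277, 8):.3f}ms"
```

With the sample values, ``CPU_AVERAGE`` evaluates to ``0.423ms``, matching the
figure in the CPU row.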

Get IO accounting for pid 1 (this works only with -p)::

	# ./getdelays -i -p 1
	printing IO accounting
	linuxrc: read=65536, write=0, cancelled_write=0

The above command can be used with -v to get more debug information.