.. SPDX-License-Identifier: (GPL-2.0+ OR CC-BY-4.0)

======================================================
Discovering Linux kernel subsystems used by a workload
======================================================

:Authors: - Shuah Khan <skhan@linuxfoundation.org>
          - Shefali Sharma <sshefali021@gmail.com>
:maintained-by: Shuah Khan <skhan@linuxfoundation.org>

Key Points
==========

 * Understanding system resources necessary to build and run a workload
   is important.
 * Linux tracing and strace can be used to discover the system resources
   in use by a workload. The completeness of the system usage information
   depends on the completeness of coverage of a workload.
 * Performance and security of the operating system can be analyzed with
   the help of tools such as:
   `perf <https://man7.org/linux/man-pages/man1/perf.1.html>`_,
   `stress-ng <https://www.mankier.com/1/stress-ng>`_,
   `paxtest <https://github.com/opntr/paxtest-freebsd>`_.
 * Once we discover and understand the workload needs, we can focus on them
   to avoid regressions and use it to evaluate safety considerations.

Methodology
===========

`strace <https://man7.org/linux/man-pages/man1/strace.1.html>`_ is a
diagnostic, instructional, and debugging tool and can be used to discover
the system resources in use by a workload. Once we discover and understand
the workload needs, we can focus on them to avoid regressions and use that
understanding to evaluate safety considerations. We use the strace tool to
trace workloads.

This method of tracing using strace tells us the system calls invoked by
the workload and doesn't include all the system calls that can be invoked
by it. In addition, this tracing method tells us just the code paths within
these system calls that are invoked. As an example, if a workload opens a
file and reads from it successfully, then the success path is the one that
is traced. Any error paths in that system call will not be traced. If a
run provides full coverage of a workload, then the method outlined here
will trace and find all possible code paths. The completeness of the system
usage information depends on the completeness of coverage of a workload.

The goal is tracing a workload on a system running a default kernel without
requiring custom kernel installs.

How do we gather fine-grained system information?
=================================================

The strace tool can be used to trace system calls made by a process and the
signals it receives. System calls are the fundamental interface between an
application and the operating system kernel. They enable a program to
request services from the kernel. For instance, the open() system call in
Linux is used to provide access to a file in the file system. strace enables
us to track all the system calls made by an application. It lists all the
system calls made by a process and their resulting output.

You can generate profiling data by combining the strace and perf record
tools to record the events and information associated with a process. This
provides insight into the process. The "perf annotate" tool generates the
statistics of each instruction of the program. This document goes over the
details of how to gather fine-grained information on a workload's usage of
system resources.

We used strace to trace the perf, stress-ng, and paxtest workloads to
illustrate our methodology to discover resources used by a workload. This
process can be applied to trace other workloads.

Getting the system ready for tracing
====================================

Before we can get started we will show you how to get your system ready.
We assume that you have a Linux distribution running on a physical system
or a virtual machine. Most distributions will include the strace command.
Let's install the other tools needed to build the Linux kernel that aren't
usually included. Please note that the following works on Debian based
distributions. You might have to find equivalent packages on other Linux
distributions.

Install tools to build the Linux kernel and the tools in the kernel
repository. scripts/ver_linux is a good way to check if your system already
has the necessary tools::

  sudo apt-get install build-essential flex bison
  sudo apt install libelf-dev systemtap-sdt-dev libaudit-dev libslang2-dev libperl-dev libdw-dev

cscope is a good tool to browse kernel sources. Let's install it now::

  sudo apt-get install cscope

Install stress-ng and paxtest::

  sudo apt-get install stress-ng
  sudo apt-get install paxtest

Workload overview
=================

As mentioned earlier, we used strace to trace the perf bench, stress-ng and
paxtest workloads to show how to analyze a workload and identify the Linux
subsystems used by these workloads. Let's start with an overview of these
three workloads to get a better understanding of what they do and how to
use them.
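
Before looking at each workload, it can help to confirm that the tools are
on the PATH. The following is a minimal sketch, assuming a POSIX shell;
``check_tools`` is a hypothetical helper name and not part of any of these
packages::

    # Hypothetical helper: report which of the given commands are missing.
    # Returns non-zero if at least one command is not found on the PATH.
    check_tools() {
        missing=0
        for tool in "$@"; do
            if command -v "$tool" >/dev/null 2>&1; then
                echo "found: $tool"
            else
                echo "missing: $tool"
                missing=1
            fi
        done
        return "$missing"
    }

For example, ``check_tools strace cscope stress-ng paxtest perf`` prints one
line per tool and returns non-zero if any of them is missing.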

perf bench (all) workload
-------------------------

The perf bench command contains multiple multi-threaded microbenchmarks
for exercising different subsystems in the Linux kernel and system calls.
This allows us to easily measure the impact of changes, which can help
mitigate performance regressions. It also acts as a common benchmarking
framework, enabling developers to easily create test cases, integrate
transparently, and use performance-rich tooling subsystems.

Stress-ng netdev stressor workload
----------------------------------

stress-ng is used for performing stress testing on the kernel. It allows
you to exercise various physical subsystems of the computer, as well as
interfaces of the OS kernel, using "stressors". They are available for
CPU, CPU cache, devices, I/O, interrupts, file system, memory, network,
operating system, pipelines, schedulers, and virtual machines. Please refer
to the `stress-ng man-page <https://www.mankier.com/1/stress-ng>`_ to
find the description of all the available stressors. The netdev stressor
starts a specified number (N) of workers that exercise various netdevice
ioctl commands across all the available network devices.

paxtest kiddie workload
-----------------------

paxtest is a program that tests buffer overflows in the kernel. It tests
kernel enforcements over memory usage. Generally, execution in some memory
segments makes buffer overflows possible. It runs a set of programs that
attempt to subvert memory usage. It is used as a regression test suite for
PaX, but might be useful to test other memory protection patches for the
kernel. We used the paxtest kiddie mode, which looks for simple
vulnerabilities.

What is strace and how do we use it?
====================================

As mentioned earlier, strace is a useful diagnostic, instructional, and
debugging tool that can be used to discover the system resources in use
by a workload. It can be used:

 * To see how a process interacts with the kernel.
 * To see why a process is failing or hanging.
 * For reverse engineering a process.
 * To find the files on which a program depends.
 * For analyzing the performance of an application.
 * For troubleshooting various problems related to the operating system.

In addition, strace can generate run-time statistics on times, calls, and
errors for each system call and report a summary when the program exits,
suppressing the regular output. This attempts to show system time (CPU time
spent running in the kernel) independent of wall clock time. We plan to use
these features to get information on workload system usage.

The strace command supports basic, verbose, and stats modes. When run in
verbose mode, strace gives more detailed information about the system calls
invoked by a process.

Running strace -c generates a report of the percentage of time spent in each
system call, the total time in seconds, the microseconds per call, the total
number of calls, the count of each system call that has failed with an error,
and the type of system call made.

 * Usage: strace <command we want to trace>
 * Verbose mode usage: strace -v <command>
 * Gather statistics: strace -c <command>

We used the "-c" option to gather fine-grained run-time statistics for the
three workloads we have chosen for this analysis.

 * perf
 * stress-ng
 * paxtest

What is cscope and how do we use it?
====================================

Now let's look at `cscope <https://cscope.sourceforge.net/>`_, a command
line tool for browsing C, C++ or Java code-bases. We can use it to find
all the references to a symbol, global definitions, functions called by a
function, functions calling a function, text strings, regular expression
patterns, and files including a file.

We can use cscope to find which system call belongs to which subsystem.
This way we can find the kernel subsystems used by a process when it is
executed.
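
The same lookup can be approximated without cscope. System calls are
defined in the kernel sources with the SYSCALL_DEFINEn() macros, so the
directory of the defining file (fs/, mm/, kernel/, net/ and so on) hints at
the subsystem. The following is a rough sketch, assuming GNU grep and a
kernel source tree that is already checked out; ``find_syscall_def`` is a
hypothetical helper name and this is a grep-based stand-in, not how cscope
works internally::

    # Hypothetical helper: list the source files that define a system call.
    # $1 is the system call name, $2 the top of a kernel source tree
    # (defaults to the current directory).
    find_syscall_def() {
        grep -rlE --include='*.c' "SYSCALL_DEFINE[0-6]\($1[,)]" "${2:-.}"
    }

For example, ``find_syscall_def openat linux`` prints the files under the
linux directory that contain a SYSCALL_DEFINEn(openat, ...) definition.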

Let's check out the latest Linux repository and build the cscope database::

  git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux
  cd linux
  cscope -R -p10  # builds cscope.out database before starting browse session
  cscope -d -p10  # starts browse session on cscope.out database

Note: Run "cscope -R -p10" to build the database and "cscope -d -p10" to
enter into the browsing session. cscope by default uses the cscope.out
database. To get out of this mode press ctrl+d. The -p option is used to
specify the number of file path components to display. -p10 is optimal for
browsing kernel sources.

What is perf and how do we use it?
==================================

Perf is an analysis tool based on Linux 2.6+ systems, which abstracts the
CPU hardware difference in performance measurement in Linux, and provides
a simple command line interface. Perf is based on the perf_events interface
exported by the kernel. It is very useful for profiling the system and
finding performance bottlenecks in an application.

If you haven't already checked out the Linux mainline repository, you can do
so and then build the kernel and the perf tool::

  git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git linux
  cd linux
  make -j3 all
  cd tools/perf
  make

Note: The perf command can be built without building the kernel in the
repository and can be run on older kernels. However, matching the kernel
and perf revisions gives more accurate information on the subsystem usage.

We used the "perf stat" and "perf bench" options. For detailed information
on the perf tool, run "perf -h".

perf stat
---------

The perf stat command generates a report of various hardware and software
events. It does so with the help of hardware counter registers found in
modern CPUs that keep the count of these activities. For example,
"perf stat cal" shows stats for the cal command.

Perf bench
----------

The perf bench command contains multiple multi-threaded microbenchmarks
for exercising different subsystems in the Linux kernel and system calls.
This allows us to easily measure the impact of changes, which can help
mitigate performance regressions. It also acts as a common benchmarking
framework, enabling developers to easily create test cases, integrate
transparently, and use performance-rich tooling.

The "perf bench all" command runs the following benchmarks:

 * sched/messaging
 * sched/pipe
 * syscall/basic
 * mem/memcpy
 * mem/memset

What is stress-ng and how do we use it?
=======================================

As mentioned earlier, stress-ng is used for performing stress testing on
the kernel. It allows you to exercise various physical subsystems of the
computer, as well as interfaces of the OS kernel, using stressors. They
are available for CPU, CPU cache, devices, I/O, interrupts, file system,
memory, network, operating system, pipelines, schedulers, and virtual
machines.

The netdev stressor starts N workers that exercise various netdevice ioctl
commands across all the available network devices. The following ioctls are
exercised:

 * SIOCGIFCONF, SIOCGIFINDEX, SIOCGIFNAME, SIOCGIFFLAGS
 * SIOCGIFADDR, SIOCGIFNETMASK, SIOCGIFMETRIC, SIOCGIFMTU
 * SIOCGIFHWADDR, SIOCGIFMAP, SIOCGIFTXQLEN

The following command runs the stressor::

  stress-ng --netdev 1 -t 60 --metrics
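
The stressor walks the network devices present on the system. To see which
devices those are on a given machine, one option is to list the entries
under /sys/class/net, where the kernel typically exposes one entry per
network interface. The following is a minimal sketch; ``list_netdevs`` is a
hypothetical helper and makes no claim about how stress-ng itself
enumerates devices::

    # Hypothetical helper: print one network interface name per line.
    # $1 optionally overrides the sysfs directory (handy for testing).
    list_netdevs() {
        ls -1 "${1:-/sys/class/net}"
    }

Running ``list_netdevs`` before the stressor gives a quick idea of how many
devices each worker will iterate over.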

We can use the perf record command to record the events and information
associated with a process. This command records the profiling data in the
perf.data file in the same directory.

Using the following commands you can record the events associated with the
netdev stressor, view the generated report perf.data, and annotate the
report to view the statistics of each instruction of the program::

  perf record stress-ng --netdev 1 -t 60 --metrics
  perf report
  perf annotate

What is paxtest and how do we use it?
=====================================

paxtest is a program that tests buffer overflows in the kernel. It tests
kernel enforcements over memory usage. Generally, execution in some memory
segments makes buffer overflows possible. It runs a set of programs that
attempt to subvert memory usage. It is used as a regression test suite for
PaX, and will be useful to test other memory protection patches for the
kernel.

paxtest provides kiddie and blackhat modes. The paxtest kiddie mode runs
in normal mode, whereas the blackhat mode tries to get around the protection
of the kernel testing for vulnerabilities. We focus on the kiddie mode here
and combine the "paxtest kiddie" run with "perf record" to collect CPU stack
traces for the paxtest kiddie run to see which function is calling other
functions in the performance profile. Then the "dwarf" (DWARF's Call Frame
Information) mode can be used to unwind the stack.

The following commands can be used to view the resulting report in
call-graph format::

  perf record --call-graph dwarf paxtest kiddie
  perf report --stdio

Tracing workloads
=================

Now that we understand the workloads, let's start tracing them.

Tracing perf bench all workload
-------------------------------

Run the following command to trace the perf bench all workload::

 strace -c perf bench all

**System Calls made by the workload**

The table below shows the system calls invoked by the workload, the number
of times each system call is invoked, and the corresponding Linux subsystem.

+-------------------+-----------+-----------------+-------------------------+
| System Call       | # calls   | Linux Subsystem | System Call (API)       |
+===================+===========+=================+=========================+
| getppid           | 10000001  | Process Mgmt.   | sys_getppid()           |
+-------------------+-----------+-----------------+-------------------------+
| clone             | 1077      | Process Mgmt.   | sys_clone()             |
+-------------------+-----------+-----------------+-------------------------+
| prctl             | 23        | Process Mgmt.   | sys_prctl()             |
+-------------------+-----------+-----------------+-------------------------+
| prlimit64         | 7         | Process Mgmt.   | sys_prlimit64()         |
+-------------------+-----------+-----------------+-------------------------+
| getpid            | 10        | Process Mgmt.   | sys_getpid()            |
+-------------------+-----------+-----------------+-------------------------+
| uname             | 3         | Process Mgmt.   | sys_uname()             |
+-------------------+-----------+-----------------+-------------------------+
| sysinfo           | 1         | Process Mgmt.   | sys_sysinfo()           |
+-------------------+-----------+-----------------+-------------------------+
| getuid            | 1         | Process Mgmt.   | sys_getuid()            |
+-------------------+-----------+-----------------+-------------------------+
| getgid            | 1         | Process Mgmt.   | sys_getgid()            |
+-------------------+-----------+-----------------+-------------------------+
| geteuid           | 1         | Process Mgmt.   | sys_geteuid()           |
+-------------------+-----------+-----------------+-------------------------+
| getegid           | 1         | Process Mgmt.   | sys_getegid()           |
+-------------------+-----------+-----------------+-------------------------+
| close             | 49951     | Filesystem      | sys_close()             |
+-------------------+-----------+-----------------+-------------------------+
| pipe              | 604       | Filesystem      | sys_pipe()              |
+-------------------+-----------+-----------------+-------------------------+
| openat            | 48560     | Filesystem      | sys_openat()            |
+-------------------+-----------+-----------------+-------------------------+
| fstat             | 8338      | Filesystem      | sys_fstat()             |
+-------------------+-----------+-----------------+-------------------------+
| stat              | 1573      | Filesystem      | sys_stat()              |
+-------------------+-----------+-----------------+-------------------------+
| pread64           | 9646      | Filesystem      | sys_pread64()           |
+-------------------+-----------+-----------------+-------------------------+
| getdents64        | 1873      | Filesystem      | sys_getdents64()        |
+-------------------+-----------+-----------------+-------------------------+
| access            | 3         | Filesystem      | sys_access()            |
+-------------------+-----------+-----------------+-------------------------+
| lstat             | 1880      | Filesystem      | sys_lstat()             |
+-------------------+-----------+-----------------+-------------------------+
| lseek             | 6         | Filesystem      | sys_lseek()             |
+-------------------+-----------+-----------------+-------------------------+
| ioctl             | 3         | Filesystem      | sys_ioctl()             |
+-------------------+-----------+-----------------+-------------------------+
| dup2              | 1         | Filesystem      | sys_dup2()              |
+-------------------+-----------+-----------------+-------------------------+
| execve            | 2         | Filesystem      | sys_execve()            |
+-------------------+-----------+-----------------+-------------------------+
| fcntl             | 8779      | Filesystem      | sys_fcntl()             |
+-------------------+-----------+-----------------+-------------------------+
| statfs            | 1         | Filesystem      | sys_statfs()            |
+-------------------+-----------+-----------------+-------------------------+
| epoll_create      | 2         | Filesystem      | sys_epoll_create()      |
+-------------------+-----------+-----------------+-------------------------+
| epoll_ctl         | 64        | Filesystem      | sys_epoll_ctl()         |
+-------------------+-----------+-----------------+-------------------------+
| newfstatat        | 8318      | Filesystem      | sys_newfstatat()        |
+-------------------+-----------+-----------------+-------------------------+
| eventfd2          | 192       | Filesystem      | sys_eventfd2()          |
+-------------------+-----------+-----------------+-------------------------+
| mmap              | 243       | Memory Mgmt.    | sys_mmap()              |
+-------------------+-----------+-----------------+-------------------------+
| mprotect          | 32        | Memory Mgmt.    | sys_mprotect()          |
+-------------------+-----------+-----------------+-------------------------+
| brk               | 21        | Memory Mgmt.    | sys_brk()               |
+-------------------+-----------+-----------------+-------------------------+
| munmap            | 128       | Memory Mgmt.    | sys_munmap()            |
+-------------------+-----------+-----------------+-------------------------+
| set_mempolicy     | 156       | Memory Mgmt.    | sys_set_mempolicy()     |
+-------------------+-----------+-----------------+-------------------------+
| set_tid_address   | 1         | Process Mgmt.   | sys_set_tid_address()   |
+-------------------+-----------+-----------------+-------------------------+
| set_robust_list   | 1         | Futex           | sys_set_robust_list()   |
+-------------------+-----------+-----------------+-------------------------+
| futex             | 341       | Futex           | sys_futex()             |
+-------------------+-----------+-----------------+-------------------------+
| sched_getaffinity | 79        | Scheduler       | sys_sched_getaffinity() |
+-------------------+-----------+-----------------+-------------------------+
| sched_setaffinity | 223       | Scheduler       | sys_sched_setaffinity() |
+-------------------+-----------+-----------------+-------------------------+
| socketpair        | 202       | Network         | sys_socketpair()        |
+-------------------+-----------+-----------------+-------------------------+
| rt_sigprocmask    | 21        | Signal          | sys_rt_sigprocmask()    |
+-------------------+-----------+-----------------+-------------------------+
| rt_sigaction      | 36        | Signal          | sys_rt_sigaction()      |
+-------------------+-----------+-----------------+-------------------------+
| rt_sigreturn      | 2         | Signal          | sys_rt_sigreturn()      |
+-------------------+-----------+-----------------+-------------------------+
| wait4             | 889       | Time            | sys_wait4()             |
+-------------------+-----------+-----------------+-------------------------+
| clock_nanosleep   | 37        | Time            | sys_clock_nanosleep()   |
+-------------------+-----------+-----------------+-------------------------+
| capget            | 4         | Capability      | sys_capget()            |
+-------------------+-----------+-----------------+-------------------------+

Tracing stress-ng netdev stressor workload
------------------------------------------

Run the following command to trace the stress-ng netdev stressor workload::

  strace -c  stress-ng --netdev 1 -t 60 --metrics

**System Calls made by the workload**

The table below shows the system calls invoked by the workload, the number
of times each system call is invoked, and the corresponding Linux subsystem.

+-------------------+-----------+-----------------+-------------------------+
| System Call       | # calls   | Linux Subsystem | System Call (API)       |
+===================+===========+=================+=========================+
| openat            | 74        | Filesystem      | sys_openat()            |
+-------------------+-----------+-----------------+-------------------------+
| close             | 75        | Filesystem      | sys_close()             |
+-------------------+-----------+-----------------+-------------------------+
| read              | 58        | Filesystem      | sys_read()              |
+-------------------+-----------+-----------------+-------------------------+
| fstat             | 20        | Filesystem      | sys_fstat()             |
+-------------------+-----------+-----------------+-------------------------+
| flock             | 10        | Filesystem      | sys_flock()             |
+-------------------+-----------+-----------------+-------------------------+
| write             | 7         | Filesystem      | sys_write()             |
+-------------------+-----------+-----------------+-------------------------+
| getdents64        | 8         | Filesystem      | sys_getdents64()        |
+-------------------+-----------+-----------------+-------------------------+
| pread64           | 8         | Filesystem      | sys_pread64()           |
+-------------------+-----------+-----------------+-------------------------+
| lseek             | 1         | Filesystem      | sys_lseek()             |
+-------------------+-----------+-----------------+-------------------------+
| access            | 2         | Filesystem      | sys_access()            |
+-------------------+-----------+-----------------+-------------------------+
| getcwd            | 1         | Filesystem      | sys_getcwd()            |
+-------------------+-----------+-----------------+-------------------------+
| execve            | 1         | Filesystem      | sys_execve()            |
+-------------------+-----------+-----------------+-------------------------+
| mmap              | 61        | Memory Mgmt.    | sys_mmap()              |
+-------------------+-----------+-----------------+-------------------------+
| munmap            | 3         | Memory Mgmt.    | sys_munmap()            |
+-------------------+-----------+-----------------+-------------------------+
| mprotect          | 20        | Memory Mgmt.    | sys_mprotect()          |
+-------------------+-----------+-----------------+-------------------------+
| mlock             | 2         | Memory Mgmt.    | sys_mlock()             |
+-------------------+-----------+-----------------+-------------------------+
| brk               | 3         | Memory Mgmt.    | sys_brk()               |
+-------------------+-----------+-----------------+-------------------------+
| rt_sigaction      | 21        | Signal          | sys_rt_sigaction()      |
+-------------------+-----------+-----------------+-------------------------+
| rt_sigprocmask    | 1         | Signal          | sys_rt_sigprocmask()    |
+-------------------+-----------+-----------------+-------------------------+
| sigaltstack       | 1         | Signal          | sys_sigaltstack()       |
+-------------------+-----------+-----------------+-------------------------+
| rt_sigreturn      | 1         | Signal          | sys_rt_sigreturn()      |
+-------------------+-----------+-----------------+-------------------------+
| getpid            | 8         | Process Mgmt.   | sys_getpid()            |
+-------------------+-----------+-----------------+-------------------------+
| prlimit64         | 5         | Process Mgmt.   | sys_prlimit64()         |
+-------------------+-----------+-----------------+-------------------------+
| arch_prctl        | 2         | Process Mgmt.   | sys_arch_prctl()        |
+-------------------+-----------+-----------------+-------------------------+
| sysinfo           | 2         | Process Mgmt.   | sys_sysinfo()           |
+-------------------+-----------+-----------------+-------------------------+
| getuid            | 2         | Process Mgmt.   | sys_getuid()            |
+-------------------+-----------+-----------------+-------------------------+
| uname             | 1         | Process Mgmt.   | sys_uname()             |
+-------------------+-----------+-----------------+-------------------------+
| setpgid           | 1         | Process Mgmt.   | sys_setpgid()           |
+-------------------+-----------+-----------------+-------------------------+
| getrusage         | 1         | Process Mgmt.   | sys_getrusage()         |
+-------------------+-----------+-----------------+-------------------------+
| geteuid           | 1         | Process Mgmt.   | sys_geteuid()           |
+-------------------+-----------+-----------------+-------------------------+
| getppid           | 1         | Process Mgmt.
| sys_getppid() | 503*b7cb8405SShuah Khan+-------------------+-----------+-----------------+-------------------------+ 504*b7cb8405SShuah Khan| sendto | 3 | Network | sys_sendto() | 505*b7cb8405SShuah Khan+-------------------+-----------+-----------------+-------------------------+ 506*b7cb8405SShuah Khan| connect | 1 | Network | sys_connect() | 507*b7cb8405SShuah Khan+-------------------+-----------+-----------------+-------------------------+ 508*b7cb8405SShuah Khan| socket | 1 | Network | sys_socket() | 509*b7cb8405SShuah Khan+-------------------+-----------+-----------------+-------------------------+ 510*b7cb8405SShuah Khan| clone | 1 | Process Mgmt. | sys_clone() | 511*b7cb8405SShuah Khan+-------------------+-----------+-----------------+-------------------------+ 512*b7cb8405SShuah Khan| set_tid_address | 1 | Process Mgmt. | sys_set_tid_address() | 513*b7cb8405SShuah Khan+-------------------+-----------+-----------------+-------------------------+ 514*b7cb8405SShuah Khan| wait4 | 2 | Time | sys_wait4() | 515*b7cb8405SShuah Khan+-------------------+-----------+-----------------+-------------------------+ 516*b7cb8405SShuah Khan| alarm | 1 | Time | sys_alarm() | 517*b7cb8405SShuah Khan+-------------------+-----------+-----------------+-------------------------+ 518*b7cb8405SShuah Khan| set_robust_list | 1 | Futex | sys_set_robust_list() | 519*b7cb8405SShuah Khan+-------------------+-----------+-----------------+-------------------------+ 520*b7cb8405SShuah Khan 521*b7cb8405SShuah KhanTracing paxtest kiddie workload 522*b7cb8405SShuah Khan------------------------------- 523*b7cb8405SShuah Khan 524*b7cb8405SShuah KhanRun the following command to trace paxtest kiddie workload:: 525*b7cb8405SShuah Khan 526*b7cb8405SShuah Khan strace -c paxtest kiddie 527*b7cb8405SShuah Khan 528*b7cb8405SShuah Khan**System Calls made by the workload** 529*b7cb8405SShuah Khan 530*b7cb8405SShuah KhanThe below table shows the system calls invoked by the workload, number of 
531*b7cb8405SShuah Khantimes each system call is invoked, and the corresponding Linux subsystem. 532*b7cb8405SShuah Khan 533*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 534*b7cb8405SShuah Khan| System Call | # calls | Linux Subsystem | System Call (API) | 535*b7cb8405SShuah Khan+===================+===========+=================+======================+ 536*b7cb8405SShuah Khan| read | 3 | Filesystem | sys_read() | 537*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 538*b7cb8405SShuah Khan| write | 11 | Filesystem | sys_write() | 539*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 540*b7cb8405SShuah Khan| close | 41 | Filesystem | sys_close() | 541*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 542*b7cb8405SShuah Khan| stat | 24 | Filesystem | sys_stat() | 543*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 544*b7cb8405SShuah Khan| fstat | 2 | Filesystem | sys_fstat() | 545*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 546*b7cb8405SShuah Khan| pread64 | 6 | Filesystem | sys_pread64() | 547*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 548*b7cb8405SShuah Khan| access | 1 | Filesystem | sys_access() | 549*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 550*b7cb8405SShuah Khan| pipe | 1 | Filesystem | sys_pipe() | 551*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 552*b7cb8405SShuah Khan| dup2 | 24 | Filesystem | sys_dup2() | 553*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 554*b7cb8405SShuah Khan| execve | 1 | Filesystem | sys_execve() | 555*b7cb8405SShuah 
Khan+-------------------+-----------+-----------------+----------------------+ 556*b7cb8405SShuah Khan| fcntl | 26 | Filesystem | sys_fcntl() | 557*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 558*b7cb8405SShuah Khan| openat | 14 | Filesystem | sys_openat() | 559*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 560*b7cb8405SShuah Khan| rt_sigaction | 7 | Signal | sys_rt_sigaction() | 561*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 562*b7cb8405SShuah Khan| rt_sigreturn | 38 | Signal | sys_rt_sigreturn() | 563*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 564*b7cb8405SShuah Khan| clone | 38 | Process Mgmt. | sys_clone() | 565*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 566*b7cb8405SShuah Khan| wait4 | 44 | Time | sys_wait4() | 567*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 568*b7cb8405SShuah Khan| mmap | 7 | Memory Mgmt. | sys_mmap() | 569*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 570*b7cb8405SShuah Khan| mprotect | 3 | Memory Mgmt. | sys_mprotect() | 571*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 572*b7cb8405SShuah Khan| munmap | 1 | Memory Mgmt. | sys_munmap() | 573*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 574*b7cb8405SShuah Khan| brk | 3 | Memory Mgmt. | sys_brk() | 575*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 576*b7cb8405SShuah Khan| getpid | 1 | Process Mgmt. | sys_getpid() | 577*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 578*b7cb8405SShuah Khan| getuid | 1 | Process Mgmt. 
| sys_getuid() | 579*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 580*b7cb8405SShuah Khan| getgid | 1 | Process Mgmt. | sys_getgid() | 581*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 582*b7cb8405SShuah Khan| geteuid | 2 | Process Mgmt. | sys_geteuid() | 583*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 584*b7cb8405SShuah Khan| getegid | 1 | Process Mgmt. | sys_getegid() | 585*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 586*b7cb8405SShuah Khan| getppid | 1 | Process Mgmt. | sys_getppid() | 587*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 588*b7cb8405SShuah Khan| arch_prctl | 2 | Process Mgmt. | sys_arch_prctl() | 589*b7cb8405SShuah Khan+-------------------+-----------+-----------------+----------------------+ 590*b7cb8405SShuah Khan 591*b7cb8405SShuah KhanConclusion 592*b7cb8405SShuah Khan========== 593*b7cb8405SShuah Khan 594*b7cb8405SShuah KhanThis document is intended to be used as a guide on how to gather fine-grained 595*b7cb8405SShuah Khaninformation on the resources in use by workloads using strace. 
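
The per-subsystem grouping used in the tables can also be computed from a
saved ``strace -c`` summary. The following is a minimal sketch, not part of
the original methodology: the function name ``calls_per_subsystem``, the
``Unmapped`` label, and the sample numbers are hypothetical, the mapping is
only a subset of the tables in this document, and the parser assumes the
usual summary layout in which the call count is the fourth column and the
system call name is the last one.

```python
from collections import Counter

# Subset of the syscall-to-subsystem mapping taken from the tables in this
# document; extend it as needed for other workloads.
SUBSYSTEM = {
    "openat": "Filesystem",
    "read": "Filesystem",
    "close": "Filesystem",
    "mmap": "Memory Mgmt.",
    "brk": "Memory Mgmt.",
    "rt_sigaction": "Signal",
    "getpid": "Process Mgmt.",
    "clone": "Process Mgmt.",
    "socket": "Network",
    "sendto": "Network",
    "wait4": "Time",
    "set_robust_list": "Futex",
}

# Made-up summary text in the layout assumed by the parser below.
SAMPLE = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 50.00    0.000020           2        10         1 openat
 25.00    0.000010           1        10           read
 25.00    0.000010           5         2           mmap
  0.00    0.000000           0         1           futex
------ ----------- ----------- --------- --------- ----------------
100.00    0.000040           1        23         1 total
"""


def calls_per_subsystem(summary_text, mapping=SUBSYSTEM):
    """Sum the 'calls' column of a strace -c summary per subsystem."""
    totals = Counter()
    for line in summary_text.splitlines():
        fields = line.split()
        # Data rows have 5 fields (no errors) or 6 fields (with errors).
        if len(fields) < 5 or fields[-1] == "total":
            continue
        try:
            float(fields[0])        # '% time' column; rejects header/rulers
            calls = int(fields[3])  # 'calls' column
        except ValueError:
            continue
        # System calls missing from the mapping are counted as "Unmapped".
        totals[mapping.get(fields[-1], "Unmapped")] += calls
    return dict(totals)


if __name__ == "__main__":
    for subsystem, calls in sorted(calls_per_subsystem(SAMPLE).items()):
        print(f"{subsystem}: {calls}")
```

To use it on a real trace, the summary can be saved with the strace ``-o``
option, for example ``strace -c -o summary.txt paxtest kiddie``, and the
contents of the file passed to the function.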

References
==========

 * `Discovery Linux Kernel Subsystems used by OpenAPS <https://elisa.tech/blog/2022/02/02/discovery-linux-kernel-subsystems-used-by-openaps>`_
 * `ELISA-White-Papers-Discovering Linux kernel subsystems used by a workload <https://github.com/elisa-tech/ELISA-White-Papers/blob/master/Processes/Discovering_Linux_kernel_subsystems_used_by_a_workload.md>`_
 * `strace <https://man7.org/linux/man-pages/man1/strace.1.html>`_
 * `perf <https://man7.org/linux/man-pages/man1/perf.1.html>`_
 * `paxtest README <https://github.com/opntr/paxtest-freebsd/blob/hardenedbsd/0.9.14-hbsd/README>`_
 * `stress-ng <https://www.mankier.com/1/stress-ng>`_
 * `Monitoring and managing system status and performance <https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/monitoring_and_managing_system_status_and_performance/index>`_