===========================================
Fault injection capabilities infrastructure
===========================================

See also drivers/md/md-faulty.c and "every_nth" module option for scsi_debug.


Available fault injection capabilities
--------------------------------------

- failslab

  injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...)

- fail_page_alloc

  injects page allocation failures. (alloc_pages(), get_free_pages(), ...)

- fail_futex

  injects futex deadlock and uaddr fault errors.

- fail_make_request

  injects disk IO errors on devices permitted by setting
  /sys/block/<device>/make-it-fail or
  /sys/block/<device>/<partition>/make-it-fail. (generic_make_request())

- fail_mmc_request

  injects MMC data errors on devices permitted by setting
  debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request.

- fail_function

  injects error returns into specific functions, which are marked by the
  ALLOW_ERROR_INJECTION() macro, by setting debugfs entries
  under /sys/kernel/debug/fail_function. No boot option is supported.

- NVMe fault injection

  injects an NVMe status code and retry flag on devices permitted by setting
  debugfs entries under /sys/kernel/debug/nvme*/fault_inject. The default
  status code is NVME_SC_INVALID_OPCODE with no retry. The status code and
  retry flag can be set via debugfs.


Configure fault-injection capabilities behavior
-----------------------------------------------

debugfs entries
^^^^^^^^^^^^^^^

The fault-inject-debugfs kernel module provides some debugfs entries for
runtime configuration of fault-injection capabilities.

- /sys/kernel/debug/fail*/probability:

  likelihood of failure injection, in percent.

  Format: <percent>

  Note that one-failure-per-hundred is a very high error rate
  for some testcases. Consider setting probability=100 and configuring
  /sys/kernel/debug/fail*/interval for such testcases.

- /sys/kernel/debug/fail*/interval:

  specifies the interval between failures, for calls to
  should_fail() that pass all the other tests.

  Note that if you enable this, by setting interval>1, you will
  probably want to set probability=100.

- /sys/kernel/debug/fail*/times:

  specifies how many times failures may happen at most.
  A value of -1 means "no limit".

- /sys/kernel/debug/fail*/space:

  specifies an initial resource "budget", decremented by "size"
  on each call to should_fail(,size). Failure injection is
  suppressed until "space" reaches zero.

- /sys/kernel/debug/fail*/verbose:

  Format: { 0 | 1 | 2 }

  specifies the verbosity of the messages when a failure is
  injected. '0' means no messages; '1' prints only a single
  log line per failure; '2' also prints a call trace, which is
  useful for debugging the problems revealed by fault injection.

- /sys/kernel/debug/fail*/task-filter:

  Format: { 'Y' | 'N' }

  A value of 'N' disables filtering by process (default).
  A value of 'Y' limits failures to only processes indicated by
  /proc/<pid>/make-it-fail==1.

- /sys/kernel/debug/fail*/require-start,
  /sys/kernel/debug/fail*/require-end,
  /sys/kernel/debug/fail*/reject-start,
  /sys/kernel/debug/fail*/reject-end:

  specifies the range of virtual addresses tested during
  stacktrace walking. Failure is injected only if some caller
  in the walked stacktrace lies within the required range, and
  none lies within the rejected range.
  Default required range is [0,ULONG_MAX) (whole of virtual address space).
  Default rejected range is [0,0).

- /sys/kernel/debug/fail*/stacktrace-depth:

  specifies the maximum stacktrace depth walked during search
  for a caller within [require-start,require-end) OR
  [reject-start,reject-end).

- /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:

  Format: { 'Y' | 'N' }

  default is 'N'; setting it to 'Y' won't inject failures into
  highmem/user allocations.

- /sys/kernel/debug/failslab/ignore-gfp-wait:
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:

  Format: { 'Y' | 'N' }

  default is 'N'; setting it to 'Y' will inject failures
  only into non-sleep allocations (GFP_ATOMIC allocations).

- /sys/kernel/debug/fail_page_alloc/min-order:

  specifies the minimum page allocation order at which failures
  are injected.

- /sys/kernel/debug/fail_futex/ignore-private:

  Format: { 'Y' | 'N' }

  default is 'N'; setting it to 'Y' will disable failure injections
  when dealing with private (address space) futexes.

- /sys/kernel/debug/fail_function/inject:

  Format: { 'function-name' | '!function-name' | '' }

  specifies the target function of error injection by name.
  If the function name is prefixed with '!', the given function is
  removed from the injection list. If nothing is specified (''),
  the injection list is cleared.

- /sys/kernel/debug/fail_function/injectable:

  (read only) shows the error injectable functions and what type of
  error values can be specified. The error type will be one of
  the following:

  - NULL: retval must be 0.
  - ERRNO: retval must be -1 to -MAX_ERRNO (-4095).
  - ERR_NULL: retval must be 0 or -1 to -MAX_ERRNO (-4095).

- /sys/kernel/debug/fail_function/<function-name>/retval:

  specifies the "error" return value to inject into the given
  function. This entry is created when the user specifies a new
  injection entry.
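
The entries described above are plain files, so configuring a capability
comes down to writing values into its debugfs directory. The following is a
minimal sketch rather than an existing tool: ``set_fault_attr`` is a
hypothetical helper name (it is not part of the kernel tree), and it takes the
attribute directory as its first argument so that it can also be tried against
a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of the kernel tree): write name=value pairs
# into one fault-attribute directory, e.g. /sys/kernel/debug/failslab.
set_fault_attr() {
    dir=$1
    shift
    for pair in "$@"; do
        name=${pair%%=*}    # text before the first '='
        value=${pair#*=}    # text after the first '='
        # Refuse to create new files: every entry must already exist.
        if [ ! -e "$dir/$name" ]; then
            echo "set_fault_attr: no such entry: $dir/$name" >&2
            return 1
        fi
        printf '%s\n' "$value" > "$dir/$name" || return 1
    done
}
```

For instance, ``set_fault_attr /sys/kernel/debug/failslab probability=100
interval=10 times=-1`` would ask for every 10th eligible slab allocation to
fail, with no upper limit on the number of failures. Running it against
debugfs requires root and a mounted debugfs.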

Boot option
^^^^^^^^^^^

In order to inject faults while debugfs is not available (early boot time),
use the boot option::

  failslab=
  fail_page_alloc=
  fail_make_request=
  fail_futex=
  mmc_core.fail_request=<interval>,<probability>,<space>,<times>

proc entries
^^^^^^^^^^^^

- /proc/<pid>/fail-nth,
  /proc/self/task/<tid>/fail-nth:

  Writing an integer N to this file makes the N-th call in the task fail.
  Reading from this file returns an integer value. A value of '0' indicates
  that the fault set up by a previous write to this file was injected.
  A positive integer N indicates that the fault has not been injected yet.
  Note that this file enables all types of faults (slab, futex, etc).
  This setting takes precedence over all other generic debugfs settings
  like probability, interval, times, etc. But per-capability settings
  (e.g. fail_futex/ignore-private) take precedence over it.

  This feature is intended for systematic testing of faults in a single
  system call. See an example below.

How to add new fault injection capability
-----------------------------------------

- #include <linux/fault-inject.h>

- define the fault attributes

  DECLARE_FAULT_ATTR(name);

  Please see the definition of struct fault_attr in fault-inject.h
  for details.

- provide a way to configure fault attributes

  - boot option

    If you need to enable the fault injection capability from boot time, you
    can provide a boot option to configure it. There is a helper function for
    it:

      setup_fault_attr(attr, str);

  - debugfs entries

    failslab, fail_page_alloc, and fail_make_request use this way.
    Helper functions:

      fault_create_debugfs_attr(name, parent, attr);

  - module parameters

    If the scope of the fault injection capability is limited to a
    single kernel module, it is better to provide module parameters to
    configure the fault attributes.

- add a hook to insert failures

  Upon should_fail() returning true, client code should inject a failure:

    should_fail(attr, size);

Application Examples
--------------------

- Inject slab allocation failures into module init/exit code::

    #!/bin/bash

    FAILTYPE=failslab
    echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 10 > /sys/kernel/debug/$FAILTYPE/probability
    echo 100 > /sys/kernel/debug/$FAILTYPE/interval
    echo -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
    echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait

    # Mark the shell as a fault target (task-filter is 'Y' above),
    # then replace it with the given command.
    faulty_system()
    {
        bash -c "echo 1 > /proc/self/make-it-fail && exec $*"
    }

    if [ $# -eq 0 ]
    then
        echo "Usage: $0 modulename [ modulename ... ]"
        exit 1
    fi

    for m in "$@"
    do
        echo inserting $m...
        faulty_system modprobe $m

        echo removing $m...
        faulty_system modprobe -r $m
    done

------------------------------------------------------------------------------

- Inject page allocation failures only for a specific module::

    #!/bin/bash

    FAILTYPE=fail_page_alloc
    module=$1

    if [ -z "$module" ]
    then
        echo "Usage: $0 <modulename>"
        exit 1
    fi

    modprobe $module

    if [ ! -d /sys/module/$module/sections ]
    then
        echo Module $module is not loaded
        exit 1
    fi

    # Restrict injection to callers located between the module's
    # .text and .data sections.
    cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start
    cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end

    echo N > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 10 > /sys/kernel/debug/$FAILTYPE/probability
    echo 100 > /sys/kernel/debug/$FAILTYPE/interval
    echo -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
    echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
    echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
    echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth

    # Turn injection off again when the script is interrupted or exits.
    trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT

    echo "Injecting errors into the module $module... (interrupt to stop)"
    sleep 1000000

------------------------------------------------------------------------------

- Inject an open_ctree error during btrfs mount::

    #!/bin/bash

    rm -f testfile.img
    dd if=/dev/zero of=testfile.img bs=1M seek=1000 count=1
    DEVICE=$(losetup --show -f testfile.img)
    mkfs.btrfs -f $DEVICE
    mkdir -p tmpmnt

    FAILTYPE=fail_function
    FAILFUNC=open_ctree
    echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject
    echo -12 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval
    echo N > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 100 > /sys/kernel/debug/$FAILTYPE/probability
    echo 0 > /sys/kernel/debug/$FAILTYPE/interval
    echo -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 1 > /sys/kernel/debug/$FAILTYPE/verbose

    # The mount is expected to fail because open_ctree returns -12.
    mount -t btrfs $DEVICE tmpmnt
    if [ $? -ne 0 ]
    then
        echo "SUCCESS!"
    else
        echo "FAILED!"
        umount tmpmnt
    fi

    # Clear the injection list.
    echo > /sys/kernel/debug/$FAILTYPE/inject

    rmdir tmpmnt
    losetup -d $DEVICE
    rm testfile.img


Tool to run command with failslab or fail_page_alloc
----------------------------------------------------

In order to make it easier to accomplish the tasks mentioned above, we can use
tools/testing/fault-injection/failcmd.sh. Please run the command
"./tools/testing/fault-injection/failcmd.sh --help" for more information and
see the following examples.

Examples:

Run the command "make -C tools/testing/selftests/ run_tests" while injecting
slab allocation failures::

    # ./tools/testing/fault-injection/failcmd.sh \
        -- make -C tools/testing/selftests/ run_tests

Same as above, except that at most 100 failures are injected instead of the
default of at most one::

    # ./tools/testing/fault-injection/failcmd.sh --times=100 \
        -- make -C tools/testing/selftests/ run_tests

Same as above, except that page allocation failures are injected instead of
slab allocation failures::

    # env FAILCMD_TYPE=fail_page_alloc \
        ./tools/testing/fault-injection/failcmd.sh --times=100 \
        -- make -C tools/testing/selftests/ run_tests

Systematic faults using fail-nth
--------------------------------

The following code systematically injects a fault at the 1st, 2nd, 3rd and so
on fault injection point reached by the socketpair() system call, and stops at
the first call that completes without a fault being injected::

  #include <sys/types.h>
  #include <sys/stat.h>
  #include <sys/socket.h>
  #include <sys/syscall.h>
  #include <fcntl.h>
  #include <unistd.h>
  #include <string.h>
  #include <stdlib.h>
  #include <stdio.h>
  #include <errno.h>

  int main(void)
  {
      int i, err, res, fail_nth, fds[2];
      char buf[128];
      ssize_t n;

      system("echo N > /sys/kernel/debug/failslab/ignore-gfp-wait");
      sprintf(buf, "/proc/self/task/%ld/fail-nth", syscall(SYS_gettid));
      fail_nth = open(buf, O_RDWR);
      if (fail_nth < 0) {
          perror(buf);
          return 1;
      }
      for (i = 1;; i++) {
          /* ask for the i-th call in this task to fail */
          sprintf(buf, "%d", i);
          write(fail_nth, buf, strlen(buf));
          res = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
          err = errno;
          /* reads back 0 if the fault was injected */
          n = pread(fail_nth, buf, sizeof(buf) - 1, 0);
          if (n < 0) {
              perror("pread");
              return 1;
          }
          buf[n] = '\0';
          if (res == 0) {
              close(fds[0]);
              close(fds[1]);
          }
          printf("%d-th fault %c: res=%d/%d\n", i, atoi(buf) ? 'N' : 'Y',
                 res, err);
          if (atoi(buf))
              break;
      }
      return 0;
  }

An example output::

  1-th fault Y: res=-1/23
  2-th fault Y: res=-1/23
  3-th fault Y: res=-1/12
  4-th fault Y: res=-1/12
  5-th fault Y: res=-1/23
  6-th fault Y: res=-1/23
  7-th fault Y: res=-1/23
  8-th fault Y: res=-1/12
  9-th fault Y: res=-1/12
  10-th fault Y: res=-1/12
  11-th fault Y: res=-1/12
  12-th fault Y: res=-1/12
  13-th fault Y: res=-1/12
  14-th fault Y: res=-1/12
  15-th fault Y: res=-1/12
  16-th fault N: res=0/12
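
In the output, the second number after ``res=`` is the errno value seen by the
caller; on Linux, 12 is ENOMEM and 23 is ENFILE. To see at a glance how the
injected faults were reported, the lines can be tallied per errno value. The
following is a small sketch, assuming the output format printed by the program
above; ``summarize_fail_nth`` is a hypothetical helper name and not an existing
tool.

```shell
#!/bin/sh
# Hypothetical helper: read "<i>-th fault Y|N: res=<res>/<errno>" lines on
# stdin and count the injected faults ("Y" lines) per errno value.
summarize_fail_nth() {
    awk '
        $3 == "Y:" {
            # $4 looks like "res=-1/23"; the third piece is the errno value
            split($4, r, "[=/]")
            count[r[3]]++
        }
        END {
            for (e in count)
                print "errno " e ": " count[e]
        }
    ' | sort
}
```

Piping the example output above through ``summarize_fail_nth`` reports 10
injected faults that surfaced as errno 12 and 5 that surfaced as errno 23.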