===========================================
Fault injection capabilities infrastructure
===========================================

See also drivers/md/md-faulty.c and "every_nth" module option for scsi_debug.


Available fault injection capabilities
--------------------------------------

- failslab

  injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...)

- fail_page_alloc

  injects page allocation failures. (alloc_pages(), get_free_pages(), ...)

- fail_usercopy

  injects failures in user memory access functions. (copy_from_user(), get_user(), ...)

- fail_futex

  injects futex deadlock and uaddr fault errors.

- fail_sunrpc

  injects kernel RPC client and server failures.

- fail_make_request

  injects disk IO errors on devices permitted by setting
  /sys/block/<device>/make-it-fail or
  /sys/block/<device>/<partition>/make-it-fail. (submit_bio_noacct())

- fail_mmc_request

  injects MMC data errors on devices permitted by setting
  debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request

- fail_function

  injects error return on specific functions, which are marked by
  the ALLOW_ERROR_INJECTION() macro, by setting debugfs entries
  under /sys/kernel/debug/fail_function. No boot option is supported.

- NVMe fault injection

  injects NVMe status code and retry flag on devices permitted by setting
  debugfs entries under /sys/kernel/debug/nvme*/fault_inject. The default
  status code is NVME_SC_INVALID_OPCODE with no retry. The status code and
  retry flag can be set via debugfs.


Configure fault-injection capabilities behavior
-----------------------------------------------

debugfs entries
^^^^^^^^^^^^^^^

The fault-inject-debugfs kernel module provides some debugfs entries for runtime
configuration of fault-injection capabilities.

- /sys/kernel/debug/fail*/probability:

	likelihood of failure injection, in percent.

	Format: <percent>

	Note that one-failure-per-hundred is a very high error rate
	for some testcases.  Consider setting probability=100 and configuring
	/sys/kernel/debug/fail*/interval for such testcases.

- /sys/kernel/debug/fail*/interval:

	specifies the interval between failures, for calls to
	should_fail() that pass all the other tests.

	Note that if you enable this, by setting interval>1, you will
	probably want to set probability=100.

- /sys/kernel/debug/fail*/times:

	specifies how many times failures may happen at most. A value of -1
	means "no limit". Note, though, that this file only accepts unsigned
	values. So, if you want to specify -1, it is better to use 'printf'
	instead of 'echo', e.g.: $ printf %#x -1 > times

- /sys/kernel/debug/fail*/space:

	specifies an initial resource "budget", decremented by "size"
	on each call to should_fail(attr, size).  Failure injection is
	suppressed until "space" reaches zero.

- /sys/kernel/debug/fail*/verbose:

	Format: { 0 | 1 | 2 }

	specifies the verbosity of the messages when failure is
	injected.  '0' means no messages; '1' will print only a single
	log line per failure; '2' will print a call trace too -- useful
	to debug the problems revealed by fault injection.

- /sys/kernel/debug/fail*/task-filter:

	Format: { 'Y' | 'N' }

	A value of 'N' disables filtering by process (default).
	A value of 'Y' limits failures to only processes indicated by
	/proc/<pid>/make-it-fail==1.

- /sys/kernel/debug/fail*/require-start,
  /sys/kernel/debug/fail*/require-end,
  /sys/kernel/debug/fail*/reject-start,
  /sys/kernel/debug/fail*/reject-end:

	specifies the range of virtual addresses tested during
	stacktrace walking.  Failure is injected only if some caller
	in the walked stacktrace lies within the required range, and
	none lies within the rejected range.
	Default required range is [0,ULONG_MAX) (whole of virtual address space).
	Default rejected range is [0,0).

- /sys/kernel/debug/fail*/stacktrace-depth:

	specifies the maximum stacktrace depth walked during search
	for a caller within [require-start,require-end) OR
	[reject-start,reject-end).

- /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:

	Format: { 'Y' | 'N' }

	default is 'Y', setting it to 'N' will also inject failures into
	highmem/user allocations (__GFP_HIGHMEM allocations).

- /sys/kernel/debug/failslab/ignore-gfp-wait:
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:

	Format: { 'Y' | 'N' }

	default is 'Y', setting it to 'N' will also inject failures
	into allocations that can sleep (__GFP_DIRECT_RECLAIM allocations).

- /sys/kernel/debug/fail_page_alloc/min-order:

	specifies the minimum page allocation order for which failures
	are injected.

- /sys/kernel/debug/fail_futex/ignore-private:

	Format: { 'Y' | 'N' }

	default is 'N', setting it to 'Y' will disable failure injections
	when dealing with private (address space) futexes.

- /sys/kernel/debug/fail_sunrpc/ignore-client-disconnect:

	Format: { 'Y' | 'N' }

	default is 'N', setting it to 'Y' will disable disconnect
	injection on the RPC client.

- /sys/kernel/debug/fail_sunrpc/ignore-server-disconnect:

	Format: { 'Y' | 'N' }

	default is 'N', setting it to 'Y' will disable disconnect
	injection on the RPC server.

- /sys/kernel/debug/fail_sunrpc/ignore-cache-wait:

	Format: { 'Y' | 'N' }

	default is 'N', setting it to 'Y' will disable cache wait
	injection on the RPC server.

- /sys/kernel/debug/fail_function/inject:

	Format: { 'function-name' | '!function-name' | '' }

	specifies the target function of error injection by name.
	If the function name has a '!' prefix, the given function is
	removed from the injection list. If nothing is specified (''),
	the injection list is cleared.

- /sys/kernel/debug/fail_function/injectable:

	(read only) shows error injectable functions and what type of
	error values can be specified. The error type will be one of
	the following:

	- NULL:	retval must be 0.
	- ERRNO: retval must be -1 to -MAX_ERRNO (-4095).
	- ERR_NULL: retval must be 0 or -1 to -MAX_ERRNO (-4095).

- /sys/kernel/debug/fail_function/<function-name>/retval:

	specifies the "error" return value to inject to the given function.
	This will be created when the user specifies a new injection entry.
	Note that this file only accepts unsigned values. So, if you want to
	use a negative errno, it is better to use 'printf' instead of 'echo', e.g.:
	$ printf %#x -12 > retval
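
The way probability, interval, times and space combine is easier to follow in
code.  The user-space sketch below models one possible order of checks.  It is
a simplified illustration and not the kernel implementation: struct fault_model
and model_should_fail() are hypothetical names, the order of the checks is an
assumption based on should_fail() in lib/fault-inject.c, rand() stands in for
the kernel random number generator, and task-filter, the stacktrace filters
and fail-nth are left out::

	#include <stdbool.h>
	#include <stdlib.h>

	/* Hypothetical user-space stand-in for the generic knobs. */
	struct fault_model {
		unsigned int probability;	/* percent, 0..100 */
		unsigned int interval;		/* every interval-th call is eligible */
		long times;			/* failures left, -1 means no limit */
		long space;			/* budget used up before failures start */
		unsigned long count;		/* calls that reached the interval check */
	};

	/* Returns true if this call should fail; size is charged against space. */
	static bool model_should_fail(struct fault_model *m, long size)
	{
		if (m->probability == 0 || m->times == 0)
			return false;

		/* Suppress failures while there is still budget left. */
		if (m->space > size) {
			m->space -= size;
			return false;
		}

		/* With interval > 1, only every interval-th call is eligible. */
		if (m->interval > 1) {
			m->count++;
			if (m->count % m->interval)
				return false;
		}

		/* probability=100 always passes this test. */
		if (m->probability <= (unsigned int)(rand() % 100))
			return false;

		if (m->times != -1)
			m->times--;
		return true;
	}

With probability=100, interval=3 and times=2, this model fails exactly the 3rd
and 6th calls and none afterwards, which is why a large interval combined with
probability=100 gives a low but predictable failure rate.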

Boot option
^^^^^^^^^^^

In order to inject faults while debugfs is not available (early boot time),
use the boot option::

	failslab=
	fail_page_alloc=
	fail_usercopy=
	fail_make_request=
	fail_futex=
	mmc_core.fail_request=<interval>,<probability>,<space>,<times>
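
Assuming every option listed above takes the same
<interval>,<probability>,<space>,<times> value, the user-space helper below
sketches how such a string breaks down into its four fields.  struct
boot_fault_args and parse_fault_args() are hypothetical names, and the field
types are assumptions chosen to resemble what setup_fault_attr() parses::

	#include <stdio.h>

	/* Hypothetical container for the four comma-separated values. */
	struct boot_fault_args {
		unsigned long interval;
		unsigned long probability;
		int space;
		int times;
	};

	/*
	 * Returns 0 on success, or -1 if str does not contain all four
	 * values.  *out is only written on success.
	 */
	static int parse_fault_args(const char *str, struct boot_fault_args *out)
	{
		struct boot_fault_args tmp;

		if (sscanf(str, "%lu,%lu,%d,%d", &tmp.interval, &tmp.probability,
			   &tmp.space, &tmp.times) != 4)
			return -1;

		*out = tmp;
		return 0;
	}

For example, a value of 10,100,0,-1 would parse as interval=10,
probability=100, space=0 and times=-1 (no limit).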

proc entries
^^^^^^^^^^^^

- /proc/<pid>/fail-nth,
  /proc/self/task/<tid>/fail-nth:

	Writing an integer N to this file makes the N-th call in the task fail.
	Reading from this file returns an integer value. A value of '0' indicates
	that the fault set up with a previous write to this file was injected.
	A positive integer N indicates that the fault wasn't yet injected.
	Note that this file enables all types of faults (slab, futex, etc).
	This setting takes precedence over all other generic debugfs settings
	like probability, interval, times, etc. But per-capability settings
	(e.g. fail_futex/ignore-private) take precedence over it.

	This feature is intended for systematic testing of faults in a single
	system call. See an example below.

How to add new fault injection capability
-----------------------------------------

- #include <linux/fault-inject.h>

- define the fault attributes

  DECLARE_FAULT_ATTR(name);

  Please see the definition of struct fault_attr in fault-inject.h
  for details.

- provide a way to configure fault attributes

  - boot option

    If you need to enable the fault injection capability from boot time, you can
    provide a boot option to configure it. There is a helper function for it:

	setup_fault_attr(attr, str);

  - debugfs entries

    failslab, fail_page_alloc, fail_usercopy, and fail_make_request use this way.
    Helper function:

	fault_create_debugfs_attr(name, parent, attr);

  - module parameters

    If the scope of the fault injection capability is limited to a
    single kernel module, it is better to provide module parameters to
    configure the fault attributes.

- add a hook to insert failures

  Upon should_fail() returning true, client code should inject a failure:

	should_fail(attr, size);
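
As a rough sketch of how the steps above fit together, the fragment below wires
up a made-up capability in the style of mm/failslab.c.  The names fail_foo,
setup_fail_foo(), fail_foo_debugfs() and foo_do_work() are hypothetical, the
choice of -ENOMEM is arbitrary, and the code is kernel code that is assumed to
be built with CONFIG_FAULT_INJECTION and CONFIG_FAULT_INJECTION_DEBUG_FS
enabled, so it cannot be run as a user-space program::

	#include <linux/err.h>
	#include <linux/errno.h>
	#include <linux/fault-inject.h>
	#include <linux/init.h>
	#include <linux/types.h>

	/* Hypothetical capability; starts with the default attributes. */
	static DECLARE_FAULT_ATTR(fail_foo);

	/* Boot option: fail_foo=<interval>,<probability>,<space>,<times> */
	static int __init setup_fail_foo(char *str)
	{
		return setup_fault_attr(&fail_foo, str);
	}
	__setup("fail_foo=", setup_fail_foo);

	/* debugfs entries: /sys/kernel/debug/fail_foo/probability and so on */
	static int __init fail_foo_debugfs(void)
	{
		struct dentry *dir;

		dir = fault_create_debugfs_attr("fail_foo", NULL, &fail_foo);
		return PTR_ERR_OR_ZERO(dir);
	}
	late_initcall(fail_foo_debugfs);

	/* Hook: fail the operation when should_fail() says so. */
	int foo_do_work(size_t len)
	{
		if (should_fail(&fail_foo, len))
			return -ENOMEM;

		/* The normal, non-failing path would go here. */
		return 0;
	}

The size passed to should_fail() is what gets charged against the "space"
budget, so a capability without a natural size can pass a constant instead.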

Application Examples
--------------------

- Inject slab allocation failures into module init/exit code::

    #!/bin/bash

    FAILTYPE=failslab
    echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 10 > /sys/kernel/debug/$FAILTYPE/probability
    echo 100 > /sys/kernel/debug/$FAILTYPE/interval
    printf %#x -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
    echo Y > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait

    faulty_system()
    {
	bash -c "echo 1 > /proc/self/make-it-fail && exec $*"
    }

    if [ $# -eq 0 ]
    then
	echo "Usage: $0 modulename [ modulename ... ]"
	exit 1
    fi

    for m in $*
    do
	echo inserting $m...
	faulty_system modprobe $m

	echo removing $m...
	faulty_system modprobe -r $m
    done

------------------------------------------------------------------------------

- Inject page allocation failures only for a specific module::

    #!/bin/bash

    FAILTYPE=fail_page_alloc
    module=$1

    if [ -z "$module" ]
    then
	echo "Usage: $0 <modulename>"
	exit 1
    fi

    modprobe $module

    if [ ! -d /sys/module/$module/sections ]
    then
	echo Module $module is not loaded
	exit 1
    fi

    cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start
    cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end

    echo N > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 10 > /sys/kernel/debug/$FAILTYPE/probability
    echo 100 > /sys/kernel/debug/$FAILTYPE/interval
    printf %#x -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
    echo Y > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
    echo Y > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
    echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth

    trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT

    echo "Injecting errors into the module $module... (interrupt to stop)"
    sleep 1000000
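
The script above sets require-start and require-end from the module's .text
and .data section addresses, so that only call chains passing through the
module are affected.  The range test described for these entries can be
modelled in user space roughly as follows.  stack_allows_fault() and its
parameters are hypothetical, callers[] stands for the return addresses of the
walked stack, and any shortcuts the kernel takes for the default ranges are
not modelled::

	#include <stdbool.h>
	#include <stddef.h>

	/*
	 * Walk at most depth entries of callers[].  The fault is allowed
	 * only if some caller lies in [require_start, require_end) and no
	 * walked caller lies in [reject_start, reject_end).
	 */
	static bool stack_allows_fault(const unsigned long *callers, size_t nr,
				       size_t depth,
				       unsigned long require_start,
				       unsigned long require_end,
				       unsigned long reject_start,
				       unsigned long reject_end)
	{
		bool found = false;
		size_t i;

		for (i = 0; i < nr && i < depth; i++) {
			/* Any caller in the rejected range vetoes the fault. */
			if (reject_start <= callers[i] && callers[i] < reject_end)
				return false;

			/* At least one caller must be in the required range. */
			if (require_start <= callers[i] && callers[i] < require_end)
				found = true;
		}
		return found;
	}

In this model a stacktrace-depth that is too small can hide the module's frame
from the walk, in which case no failure is injected even though the module is
on the call chain.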

------------------------------------------------------------------------------

- Inject an open_ctree error during btrfs mount::

    #!/bin/bash

    rm -f testfile.img
    dd if=/dev/zero of=testfile.img bs=1M seek=1000 count=1
    DEVICE=$(losetup --show -f testfile.img)
    mkfs.btrfs -f $DEVICE
    mkdir -p tmpmnt

    FAILTYPE=fail_function
    FAILFUNC=open_ctree
    echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject
    printf %#x -12 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval
    echo N > /sys/kernel/debug/$FAILTYPE/task-filter
    echo 100 > /sys/kernel/debug/$FAILTYPE/probability
    echo 0 > /sys/kernel/debug/$FAILTYPE/interval
    printf %#x -1 > /sys/kernel/debug/$FAILTYPE/times
    echo 0 > /sys/kernel/debug/$FAILTYPE/space
    echo 1 > /sys/kernel/debug/$FAILTYPE/verbose

    mount -t btrfs $DEVICE tmpmnt
    if [ $? -ne 0 ]
    then
	echo "SUCCESS!"
    else
	echo "FAILED!"
	umount tmpmnt
    fi

    echo > /sys/kernel/debug/$FAILTYPE/inject

    rmdir tmpmnt
    losetup -d $DEVICE
    rm testfile.img
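
The scripts above use printf %#x to hand negative values such as -1 or -12 to
attribute files that only accept unsigned numbers.  A test program written in
C can do the same conversion itself.  The sketch below assumes that the
attribute is as wide as unsigned long on the machine under test;
format_unsigned_hex() and write_attr() are hypothetical helpers and not part
of any kernel tool::

	#include <stdio.h>

	/* Format value as unsigned hex, like "printf %#x" in the scripts. */
	static int format_unsigned_hex(char *buf, size_t len, long value)
	{
		return snprintf(buf, len, "%#lx", (unsigned long)value);
	}

	/*
	 * Write value to an attribute file such as .../times or .../retval.
	 * Returns 0 on success and -1 on any I/O error.
	 */
	static int write_attr(const char *path, long value)
	{
		char buf[32];
		FILE *f;
		int ret = 0;

		format_unsigned_hex(buf, sizeof(buf), value);

		f = fopen(path, "w");
		if (!f)
			return -1;
		if (fputs(buf, f) == EOF)
			ret = -1;
		if (fclose(f) == EOF)
			ret = -1;
		return ret;
	}

On a system where long is 64 bits wide, -12 (that is, -ENOMEM) is written as
0xfffffffffffffff4.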


Tool to run command with failslab or fail_page_alloc
----------------------------------------------------

In order to make it easier to accomplish the tasks mentioned above, we can use
tools/testing/fault-injection/failcmd.sh.  Please run the command
"./tools/testing/fault-injection/failcmd.sh --help" for more information and
see the following examples.

Examples:

Run the command "make -C tools/testing/selftests/ run_tests" while injecting
slab allocation failures::

	# ./tools/testing/fault-injection/failcmd.sh \
		-- make -C tools/testing/selftests/ run_tests

Same as above, but allow at most 100 failures instead of the default of at
most one::

	# ./tools/testing/fault-injection/failcmd.sh --times=100 \
		-- make -C tools/testing/selftests/ run_tests

Same as above, but inject page allocation failures instead of slab
allocation failures::

	# env FAILCMD_TYPE=fail_page_alloc \
		./tools/testing/fault-injection/failcmd.sh --times=100 \
		-- make -C tools/testing/selftests/ run_tests

Systematic faults using fail-nth
--------------------------------

The following code systematically fails the 1st, 2nd, 3rd and so on
fault-capable call made within the socketpair() system call::

  #include <sys/types.h>
  #include <sys/stat.h>
  #include <sys/socket.h>
  #include <sys/syscall.h>
  #include <fcntl.h>
  #include <unistd.h>
  #include <string.h>
  #include <stdlib.h>
  #include <stdio.h>
  #include <errno.h>

  int main(void)
  {
	int i, err, res, fail_nth, fds[2];
	char buf[128];
	ssize_t n;

	/* Also allow failures in allocations that can sleep. */
	system("echo N > /sys/kernel/debug/failslab/ignore-gfp-wait");
	sprintf(buf, "/proc/self/task/%ld/fail-nth", syscall(SYS_gettid));
	fail_nth = open(buf, O_RDWR);
	if (fail_nth < 0) {
		perror("open fail-nth");
		return 1;
	}
	for (i = 1;; i++) {
		/* Arm the i-th fault-capable call made by this task. */
		sprintf(buf, "%d", i);
		write(fail_nth, buf, strlen(buf));
		res = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
		err = errno;
		/* Reading back 0 means the armed fault was injected. */
		n = pread(fail_nth, buf, sizeof(buf) - 1, 0);
		if (n < 0) {
			perror("pread fail-nth");
			return 1;
		}
		buf[n] = '\0';
		if (res == 0) {
			close(fds[0]);
			close(fds[1]);
		}
		printf("%d-th fault %c: res=%d/%d\n", i, atoi(buf) ? 'N' : 'Y',
			res, err);
		if (atoi(buf))
			break;
	}
	return 0;
  }

An example output::

	1-th fault Y: res=-1/23
	2-th fault Y: res=-1/23
	3-th fault Y: res=-1/12
	4-th fault Y: res=-1/12
	5-th fault Y: res=-1/23
	6-th fault Y: res=-1/23
	7-th fault Y: res=-1/23
	8-th fault Y: res=-1/12
	9-th fault Y: res=-1/12
	10-th fault Y: res=-1/12
	11-th fault Y: res=-1/12
	12-th fault Y: res=-1/12
	13-th fault Y: res=-1/12
	14-th fault Y: res=-1/12
	15-th fault Y: res=-1/12
	16-th fault N: res=0/12