.. SPDX-License-Identifier: GPL-2.0
.. Copyright (C) 2022, Google LLC.

===================================
The Kernel Memory Sanitizer (KMSAN)
===================================

KMSAN is a dynamic error detector aimed at finding uses of uninitialized
values. It is based on compiler instrumentation, and is quite similar to the
userspace `MemorySanitizer tool`_.

Note that KMSAN is not intended for production use, because it drastically
increases the kernel memory footprint and slows the whole system down.

Usage
=====

Building the kernel
-------------------

To build a kernel with KMSAN you will need a recent Clang (14.0.6 or newer).
Please refer to the `LLVM documentation`_ for instructions on how to build
Clang.

Now configure and build the kernel with ``CONFIG_KMSAN`` enabled.

Example report
--------------

Here is an example of a KMSAN report::

  =====================================================
  BUG: KMSAN: uninit-value in test_uninit_kmsan_check_memory+0x1be/0x380 [kmsan_test]
   test_uninit_kmsan_check_memory+0x1be/0x380 mm/kmsan/kmsan_test.c:273
   kunit_run_case_internal lib/kunit/test.c:333
   kunit_try_run_case+0x206/0x420 lib/kunit/test.c:374
   kunit_generic_run_threadfn_adapter+0x6d/0xc0 lib/kunit/try-catch.c:28
   kthread+0x721/0x850 kernel/kthread.c:327
   ret_from_fork+0x1f/0x30 ??:?

  Uninit was stored to memory at:
   do_uninit_local_array+0xfa/0x110 mm/kmsan/kmsan_test.c:260
   test_uninit_kmsan_check_memory+0x1a2/0x380 mm/kmsan/kmsan_test.c:271
   kunit_run_case_internal lib/kunit/test.c:333
   kunit_try_run_case+0x206/0x420 lib/kunit/test.c:374
   kunit_generic_run_threadfn_adapter+0x6d/0xc0 lib/kunit/try-catch.c:28
   kthread+0x721/0x850 kernel/kthread.c:327
   ret_from_fork+0x1f/0x30 ??:?

  Local variable uninit created at:
   do_uninit_local_array+0x4a/0x110 mm/kmsan/kmsan_test.c:256
   test_uninit_kmsan_check_memory+0x1a2/0x380 mm/kmsan/kmsan_test.c:271

  Bytes 4-7 of 8 are uninitialized
  Memory access of size 8 starts at ffff888083fe3da0

  CPU: 0 PID: 6731 Comm: kunit_try_catch Tainted: G    B       E     5.16.0-rc3+ #104
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
  =====================================================

The report says that the local variable ``uninit`` was created uninitialized in
``do_uninit_local_array()``. The third stack trace corresponds to the place
where this variable was created.

The first stack trace shows where the uninit value was used (in
``test_uninit_kmsan_check_memory()``). The tool shows the bytes which were left
uninitialized in the local variable, as well as the stack where the value was
copied to another memory location before use.

A use of an uninitialized value ``v`` is reported by KMSAN in the following
cases:

 - in a condition, e.g. ``if (v) { ... }``;
 - as an array index or a dereferenced pointer, e.g. ``array[v]`` or ``*v``;
 - when it is copied to userspace or hardware, e.g. ``copy_to_user(..., &v, ...)``;
 - when it is passed as an argument to a function, and
   ``CONFIG_KMSAN_CHECK_PARAM_RETVAL`` is enabled (see below).

The mentioned cases (apart from copying data to userspace or hardware, which is
a security issue) are considered undefined behavior by the C11 standard.

Disabling the instrumentation
-----------------------------

A function can be marked with ``__no_kmsan_checks``. Doing so makes KMSAN
ignore uninitialized values in that function and mark its output as initialized.
As a result, the user will not get KMSAN reports related to that function.

Another function attribute supported by KMSAN is ``__no_sanitize_memory``.
Applying this attribute to a function will result in KMSAN not instrumenting
it, which can be helpful if we do not want the compiler to interfere with some
low-level code (e.g. code marked with ``noinstr``, which implicitly adds
``__no_sanitize_memory``).

This however comes at a cost: stack allocations from such functions will have
incorrect shadow/origin values, likely leading to false positives. Functions
called from non-instrumented code may also receive incorrect metadata for their
parameters.

As a rule of thumb, avoid using ``__no_sanitize_memory`` explicitly.

It is also possible to disable KMSAN for a single file (e.g. main.o)::

  KMSAN_SANITIZE_main.o := n

or for the whole directory::

  KMSAN_SANITIZE := n

in the Makefile. Think of this as applying ``__no_sanitize_memory`` to every
function in the file or directory. Most users won't need ``KMSAN_SANITIZE``,
unless their code gets broken by KMSAN (e.g. runs at early boot time).

Support
=======

In order for KMSAN to work the kernel must be built with Clang, which so far is
the only compiler that has KMSAN support. The kernel instrumentation pass is
based on the userspace `MemorySanitizer tool`_.

The runtime library only supports x86_64 at the moment.

How KMSAN works
===============

KMSAN shadow memory
-------------------

KMSAN associates a metadata byte (also called shadow byte) with every byte of
kernel memory. A bit in the shadow byte is set if and only if the corresponding
bit of the kernel memory byte is uninitialized. Marking the memory
uninitialized (i.e. setting its shadow bytes to ``0xff``) is called poisoning;
marking it initialized (setting the shadow bytes to ``0x00``) is called
unpoisoning.

When a new variable is allocated on the stack, it is poisoned by default by
instrumentation code inserted by the compiler (unless it is a stack variable
that is immediately initialized). Any new heap allocation done without
``__GFP_ZERO`` is also poisoned.

Compiler instrumentation also tracks the shadow values as they are used along
the code. When needed, instrumentation code invokes the runtime library in
``mm/kmsan/`` to persist shadow values.

The shadow value of a basic or compound type is an array of bytes of the same
length. When a constant value is written into memory, that memory is unpoisoned.
When a value is read from memory, its shadow memory is also obtained and
propagated into all the operations which use that value. For every instruction
that takes one or more values the compiler generates code that calculates the
shadow of the result depending on those values and their shadows.

Example::

  int a = 0xff;  // i.e. 0x000000ff
  int b;
  int c = a | b;

In this case the shadow of ``a`` is ``0``, shadow of ``b`` is ``0xffffffff``,
shadow of ``c`` is ``0xffffff00``. This means that the upper three bytes of
``c`` are uninitialized, while the lower byte is initialized.
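
The propagation rule for a bitwise OR can be sketched in userspace C.
``shadow_or()`` is a hypothetical helper, not a KMSAN runtime function; the
formula follows the approach of the userspace MemorySanitizer and is given here
only as an illustration under that assumption::

  #include <stdint.h>

  /*
   * Shadow of (a | b), given the values and shadows of both operands.
   * A set shadow bit means "uninitialized". A result bit is initialized
   * when both input bits are initialized, or when either input has an
   * initialized 1 in that position (the result is then 1 regardless of
   * the other operand).
   */
  static uint32_t shadow_or(uint32_t a, uint32_t sa, uint32_t b, uint32_t sb)
  {
    return (sa & sb) | (~a & sb) | (sa & ~b);
  }

With ``a = 0xff`` (shadow ``0``) and a fully poisoned ``b`` (shadow
``0xffffffff``), ``shadow_or()`` returns ``0xffffff00`` whatever the value of
``b`` is, which matches the example above.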

Origin tracking
---------------

Every four bytes of kernel memory also have a so-called origin mapped to them.
This origin describes the point in program execution at which the uninitialized
value was created. Every origin is associated with either the full allocation
stack (for heap-allocated memory), or the function containing the uninitialized
variable (for locals).

When an uninitialized variable is allocated on the stack or heap, a new origin
value is created, and that variable's origin is filled with that value. When a
value is read from memory, its origin is also read and kept together with the
shadow. For every instruction that takes one or more values, the origin of the
result is one of the origins corresponding to any of the uninitialized inputs.
If a poisoned value is written into memory, its origin is written to the
corresponding storage as well.

Example 1::

  int a = 42;
  int b;
  int c = a + b;

In this case the origin of ``b`` is generated upon function entry, and is
stored to the origin of ``c`` right before the addition result is written into
memory.

Several variables may share the same origin address, if they are stored in the
same four-byte chunk. In this case every write to either variable updates the
origin for all of them. We have to sacrifice precision in this case, because
storing origins for individual bits (and even bytes) would be too costly.

Example 2::

  int combine(short a, short b) {
    union ret_t {
      int i;
      short s[2];
    } ret;
    ret.s[0] = a;
    ret.s[1] = b;
    return ret.i;
  }

If ``a`` is initialized and ``b`` is not, the shadow of the result would be
``0xffff0000``, and the origin of the result would be the origin of ``b``.
``ret.s[0]`` would have the same origin, but it will never be used, because
that variable is initialized.

If both function arguments are uninitialized, only the origin of the second
argument is preserved.
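
The effect of sharing one origin slot between two ``short`` values can be
modelled in userspace C. The arrays and helpers below (``shadow_mem``,
``origin_mem``, ``store_meta_u16()``, ``load_shadow_u32()``) are hypothetical
names used for illustration, origins are plain integers rather than real stack
handles, and a little-endian host is assumed::

  #include <stddef.h>
  #include <stdint.h>
  #include <string.h>

  #define MEM_SIZE 16

  static uint8_t shadow_mem[MEM_SIZE];      /* one shadow byte per data byte */
  static uint32_t origin_mem[MEM_SIZE / 4]; /* one origin per four data bytes */

  /* Record the metadata for a two-byte store at byte offset off. */
  static void store_meta_u16(size_t off, uint16_t shadow, uint32_t origin)
  {
    memcpy(&shadow_mem[off], &shadow, sizeof(shadow));
    /* Only a poisoned value overwrites the origin of its four-byte slot. */
    if (shadow)
      origin_mem[off / 4] = origin;
  }

  /* Read back the shadow of the four-byte value at byte offset off. */
  static uint32_t load_shadow_u32(size_t off)
  {
    uint32_t shadow;

    memcpy(&shadow, &shadow_mem[off], sizeof(shadow));
    return shadow;
  }

Storing an initialized ``a`` at offset 0 and a poisoned ``b`` with origin
``0xb0b`` at offset 2 leaves ``load_shadow_u32(0)`` equal to ``0xffff0000`` and
``origin_mem[0]`` equal to ``0xb0b``. If both halves are poisoned, the second
store overwrites the slot, so only the origin of ``b`` survives.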

Origin chaining
~~~~~~~~~~~~~~~

To ease debugging, KMSAN creates a new origin for every store of an
uninitialized value to memory. The new origin references both its creation stack
and the previous origin the value had. This may cause increased memory
consumption, so we limit the length of origin chains in the runtime.
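
A bounded origin chain can be sketched as a linked list with a depth counter.
Everything below is hypothetical: the names ``struct origin`` and
``chain_origin()``, the limit ``MAX_CHAIN_DEPTH``, and the choice to reuse the
previous origin once the limit is reached are assumptions made for this
illustration, not a description of the runtime::

  #include <stddef.h>
  #include <stdint.h>

  #define MAX_CHAIN_DEPTH 3 /* illustrative value, not the kernel's limit */

  struct origin {
    uint32_t stack_id;          /* made-up ID of the stack that created it */
    const struct origin *prev;  /* origin the value had before this store */
    unsigned int depth;         /* number of links behind this origin */
  };

  /*
   * Fill in *out as the origin for a store of a value whose current origin
   * is prev, and return the origin the stored value should carry. Once the
   * chain is MAX_CHAIN_DEPTH links long, no link is added and prev is
   * returned unchanged.
   */
  static const struct origin *chain_origin(struct origin *out,
                                           const struct origin *prev,
                                           uint32_t stack_id)
  {
    if (prev && prev->depth >= MAX_CHAIN_DEPTH)
      return prev;
    out->stack_id = stack_id;
    out->prev = prev;
    out->depth = prev ? prev->depth + 1 : 0;
    return out;
  }

Walking ``prev`` from the final origin yields the sequence of stores the value
went through, newest first, down to the origin created at allocation time.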

Clang instrumentation API
-------------------------

The Clang instrumentation pass inserts calls to functions defined in
``mm/kmsan/instrumentation.c`` into the kernel code.

Shadow manipulation
~~~~~~~~~~~~~~~~~~~

For every memory access the compiler emits a call to a function that returns a
pair of pointers to the shadow and origin addresses of the given memory::

  typedef struct {
    void *shadow, *origin;
  } shadow_origin_ptr_t;

  shadow_origin_ptr_t __msan_metadata_ptr_for_load_{1,2,4,8}(void *addr)
  shadow_origin_ptr_t __msan_metadata_ptr_for_store_{1,2,4,8}(void *addr)
  shadow_origin_ptr_t __msan_metadata_ptr_for_load_n(void *addr, uintptr_t size)
  shadow_origin_ptr_t __msan_metadata_ptr_for_store_n(void *addr, uintptr_t size)

The function name depends on the memory access size.

The compiler makes sure that for every loaded value its shadow and origin
values are read from memory. When a value is stored to memory, its shadow and
origin are also stored using the metadata pointers.

Handling locals
~~~~~~~~~~~~~~~

A special function is used to create a new origin value for a local variable and
set the origin of that variable to that value::

  void __msan_poison_alloca(void *addr, uintptr_t size, char *descr)

Access to per-task data
~~~~~~~~~~~~~~~~~~~~~~~

At the beginning of every instrumented function KMSAN inserts a call to
``__msan_get_context_state()``::

  kmsan_context_state *__msan_get_context_state(void)

``kmsan_context_state`` is declared in ``include/linux/kmsan.h``::

  struct kmsan_context_state {
    char param_tls[KMSAN_PARAM_SIZE];
    char retval_tls[KMSAN_RETVAL_SIZE];
    char va_arg_tls[KMSAN_PARAM_SIZE];
    char va_arg_origin_tls[KMSAN_PARAM_SIZE];
    u64 va_arg_overflow_size_tls;
    char param_origin_tls[KMSAN_PARAM_SIZE];
    depot_stack_handle_t retval_origin_tls;
  };

This structure is used by KMSAN to pass parameter shadows and origins between
instrumented functions (unless the parameters are checked immediately by
``CONFIG_KMSAN_CHECK_PARAM_RETVAL``).

Passing uninitialized values to functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Clang's MemorySanitizer instrumentation has an option,
``-fsanitize-memory-param-retval``, which makes the compiler check function
parameters passed by value, as well as function return values.

The option is controlled by ``CONFIG_KMSAN_CHECK_PARAM_RETVAL``, which is
enabled by default to let KMSAN report uninitialized values earlier.
Please refer to the `LKML discussion`_ for more details.

Because of the way the checks are implemented in LLVM (they are only applied to
parameters marked as ``noundef``), not all parameters are guaranteed to be
checked, so we cannot give up the metadata storage in ``kmsan_context_state``.

String functions
~~~~~~~~~~~~~~~~

The compiler replaces calls to ``memcpy()``/``memmove()``/``memset()`` with the
following functions. These functions are also called when data structures are
initialized or copied, making sure shadow and origin values are copied
alongside the data::

  void *__msan_memcpy(void *dst, void *src, uintptr_t n)
  void *__msan_memmove(void *dst, void *src, uintptr_t n)
  void *__msan_memset(void *dst, int c, uintptr_t n)

Error reporting
~~~~~~~~~~~~~~~

For each use of a value the compiler emits a shadow check that calls
``__msan_warning()`` if that value is poisoned::

  void __msan_warning(u32 origin)

``__msan_warning()`` causes the KMSAN runtime to print an error report.

Inline assembly instrumentation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

KMSAN instruments every inline assembly output with a call to the following
function, which unpoisons the memory region::

  void __msan_instrument_asm_store(void *addr, uintptr_t size)

This approach may mask certain errors, but it also helps to avoid a lot of
false positives in bitwise operations, atomics etc.

Sometimes the pointers passed into inline assembly do not point to valid memory.
In such cases they are ignored at runtime.


Runtime library
---------------

The code is located in ``mm/kmsan/``.

Per-task KMSAN state
~~~~~~~~~~~~~~~~~~~~

Every task_struct has an associated KMSAN task state that holds the KMSAN
context (see above) and a per-task flag that controls whether KMSAN reports are
allowed::

  struct kmsan_ctx {
    ...
    bool allow_reporting;
    struct kmsan_context_state cstate;
    ...
  };

  struct task_struct {
    ...
    struct kmsan_ctx kmsan;
    ...
  };

KMSAN contexts
~~~~~~~~~~~~~~

When running in a kernel task context, KMSAN uses ``current->kmsan.cstate`` to
hold the metadata for function parameters and return values.

When the kernel is running in interrupt, softirq or NMI context, where
``current`` is unavailable, KMSAN switches to per-CPU interrupt state::

  DEFINE_PER_CPU(struct kmsan_ctx, kmsan_percpu_ctx);

Metadata allocation
~~~~~~~~~~~~~~~~~~~

There are several places in the kernel for which the metadata is stored.

1. Each ``struct page`` instance contains two pointers to its shadow and
origin pages::

  struct page {
    ...
    struct page *shadow, *origin;
    ...
  };

At boot-time, the kernel allocates shadow and origin pages for every available
kernel page. This is done quite late, when the kernel address space is already
fragmented, so normal data pages may arbitrarily interleave with the metadata
pages.

This means that in general for two contiguous memory pages their shadow/origin
pages may not be contiguous. Consequently, if a memory access crosses the
boundary of a memory block, accesses to shadow/origin memory may potentially
corrupt other pages or read incorrect values from them.

In practice, contiguous memory pages returned by the same ``alloc_pages()``
call will have contiguous metadata, whereas if these pages belong to two
different allocations their metadata pages can be fragmented.

For the kernel data (``.data``, ``.bss`` etc.) and percpu memory regions
there also are no guarantees on metadata contiguity.

If ``__msan_metadata_ptr_for_XXX_YYY()`` hits the boundary between two pages
with non-contiguous metadata, it returns pointers to fake shadow/origin
regions::

  char dummy_load_page[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
  char dummy_store_page[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));

``dummy_load_page`` is zero-initialized, so reads from it always yield zeroes.
All stores to ``dummy_store_page`` are ignored.

2. For vmalloc memory and modules, there is a direct mapping between the memory
range, its shadow and origin. KMSAN reduces the vmalloc area by 3/4, making only
the first quarter available to ``vmalloc()``. The second quarter of the vmalloc
area contains shadow memory for the first quarter, the third one holds the
origins. A small part of the fourth quarter contains shadow and origins for the
kernel modules. Please refer to ``arch/x86/include/asm/pgtable_64_types.h`` for
more details.

When an array of pages is mapped into a contiguous virtual memory space, their
shadow and origin pages are similarly mapped into contiguous regions.
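
The direct mapping for vmalloc memory amounts to adding a fixed offset to the
address. The sketch below uses made-up constants (``EX_VMALLOC_START``,
``EX_VMALLOC_SIZE``) and hypothetical helper names; the real values are in the
header mentioned above, and rounding the origin address down to a four-byte
boundary is an assumption based on each origin covering four bytes::

  #include <stdint.h>

  /* Illustrative values only; the real ones are architecture-specific. */
  #define EX_VMALLOC_START 0x100000000000ULL
  #define EX_VMALLOC_SIZE  0x40000000000ULL  /* whole area before the split */
  #define EX_QUARTER       (EX_VMALLOC_SIZE / 4)

  /* The shadow of a vmalloc address lives one quarter above it. */
  static uint64_t ex_vmalloc_shadow(uint64_t addr)
  {
    return addr + EX_QUARTER;
  }

  /* Origins live two quarters above; one origin slot covers four bytes. */
  static uint64_t ex_vmalloc_origin(uint64_t addr)
  {
    return (addr & ~3ULL) + 2 * EX_QUARTER;
  }

  /* Only the first quarter of the area is usable by vmalloc(). */
  static int ex_is_vmalloc_addr(uint64_t addr)
  {
    return addr >= EX_VMALLOC_START && addr < EX_VMALLOC_START + EX_QUARTER;
  }

Because the offsets are constant, no per-page lookup is needed for this range,
and a contiguous vmalloc region always has contiguous shadow and origin regions.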

References
==========

E. Stepanov, K. Serebryany. `MemorySanitizer: fast detector of uninitialized
memory use in C++
<https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43308.pdf>`_.
In Proceedings of CGO 2015.

.. _MemorySanitizer tool: https://clang.llvm.org/docs/MemorySanitizer.html
.. _LLVM documentation: https://llvm.org/docs/GettingStarted.html
.. _LKML discussion: https://lore.kernel.org/all/20220614144853.3693273-1-glider@google.com/