.. SPDX-License-Identifier: GPL-2.0
.. Copyright (C) 2022, Google LLC.

===================================
The Kernel Memory Sanitizer (KMSAN)
===================================

KMSAN is a dynamic error detector aimed at finding uses of uninitialized
values. It is based on compiler instrumentation, and is quite similar to the
userspace `MemorySanitizer tool`_.

An important note is that KMSAN is not intended for production use, because it
drastically increases kernel memory footprint and slows the whole system down.

Usage
=====

Building the kernel
-------------------

In order to build a kernel with KMSAN you will need a fresh Clang (14.0.6+).
Please refer to `LLVM documentation`_ for the instructions on how to build Clang.

Now configure and build the kernel with CONFIG_KMSAN enabled.

Example report
--------------

Here is an example of a KMSAN report::

  =====================================================
  BUG: KMSAN: uninit-value in test_uninit_kmsan_check_memory+0x1be/0x380 [kmsan_test]
   test_uninit_kmsan_check_memory+0x1be/0x380 mm/kmsan/kmsan_test.c:273
   kunit_run_case_internal lib/kunit/test.c:333
   kunit_try_run_case+0x206/0x420 lib/kunit/test.c:374
   kunit_generic_run_threadfn_adapter+0x6d/0xc0 lib/kunit/try-catch.c:28
   kthread+0x721/0x850 kernel/kthread.c:327
   ret_from_fork+0x1f/0x30 ??:?

  Uninit was stored to memory at:
   do_uninit_local_array+0xfa/0x110 mm/kmsan/kmsan_test.c:260
   test_uninit_kmsan_check_memory+0x1a2/0x380 mm/kmsan/kmsan_test.c:271
   kunit_run_case_internal lib/kunit/test.c:333
   kunit_try_run_case+0x206/0x420 lib/kunit/test.c:374
   kunit_generic_run_threadfn_adapter+0x6d/0xc0 lib/kunit/try-catch.c:28
   kthread+0x721/0x850 kernel/kthread.c:327
   ret_from_fork+0x1f/0x30 ??:?

  Local variable uninit created at:
   do_uninit_local_array+0x4a/0x110 mm/kmsan/kmsan_test.c:256
   test_uninit_kmsan_check_memory+0x1a2/0x380 mm/kmsan/kmsan_test.c:271

  Bytes 4-7 of 8 are uninitialized
  Memory access of size 8 starts at ffff888083fe3da0

  CPU: 0 PID: 6731 Comm: kunit_try_catch Tainted: G B E 5.16.0-rc3+ #104
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
  =====================================================

The report says that the local variable ``uninit`` was created uninitialized in
``do_uninit_local_array()``. The third stack trace corresponds to the place
where this variable was created.

The first stack trace shows where the uninit value was used (in
``test_uninit_kmsan_check_memory()``). The tool shows the bytes which were left
uninitialized in the local variable, as well as the stack where the value was
copied to another memory location before use.

A use of uninitialized value ``v`` is reported by KMSAN in the following cases:

 - in a condition, e.g. ``if (v) { ... }``;
 - in indexing or pointer dereferencing, e.g. ``array[v]`` or ``*v``;
 - when it is copied to userspace or hardware, e.g. ``copy_to_user(..., &v, ...)``;
 - when it is passed as an argument to a function, and
   ``CONFIG_KMSAN_CHECK_PARAM_RETVAL`` is enabled (see below).

The mentioned cases (apart from copying data to userspace or hardware, which is
a security issue) are considered undefined behavior from the C11 Standard point
of view.

Disabling the instrumentation
-----------------------------

A function can be marked with ``__no_kmsan_checks``. Doing so makes KMSAN
ignore uninitialized values in that function and mark its output as initialized.
As a result, the user will not get KMSAN reports related to that function.

Another function attribute supported by KMSAN is ``__no_sanitize_memory``.
Applying this attribute to a function will result in KMSAN not instrumenting
it, which can be helpful if we do not want the compiler to interfere with some
low-level code (e.g. that marked with ``noinstr`` which implicitly adds
``__no_sanitize_memory``).

This however comes at a cost: stack allocations from such functions will have
incorrect shadow/origin values, likely leading to false positives. Functions
called from non-instrumented code may also receive incorrect metadata for their
parameters.

As a rule of thumb, avoid using ``__no_sanitize_memory`` explicitly.

It is also possible to disable KMSAN for a single file (e.g. main.o)::

  KMSAN_SANITIZE_main.o := n

or for the whole directory::

  KMSAN_SANITIZE := n

in the Makefile. Think of this as applying ``__no_sanitize_memory`` to every
function in the file or directory. Most users won't need KMSAN_SANITIZE, unless
their code gets broken by KMSAN (e.g. runs at early boot time).

Support
=======

In order for KMSAN to work the kernel must be built with Clang, which so far is
the only compiler that has KMSAN support. The kernel instrumentation pass is
based on the userspace `MemorySanitizer tool`_.

The runtime library only supports x86_64 at the moment.

How KMSAN works
===============

KMSAN shadow memory
-------------------

KMSAN associates a metadata byte (also called shadow byte) with every byte of
kernel memory. A bit in the shadow byte is set iff the corresponding bit of the
kernel memory byte is uninitialized. Marking the memory uninitialized (i.e.
setting its shadow bytes to ``0xff``) is called poisoning; marking it
initialized (setting the shadow bytes to ``0x00``) is called unpoisoning.

When a new variable is allocated on the stack, it is poisoned by default by
instrumentation code inserted by the compiler (unless it is a stack variable
that is immediately initialized). Any new heap allocation done without
``__GFP_ZERO`` is also poisoned.

Compiler instrumentation also tracks the shadow values as they are used along
the code. When needed, instrumentation code invokes the runtime library in
``mm/kmsan/`` to persist shadow values.

The shadow value of a basic or compound type is an array of bytes of the same
length. When a constant value is written into memory, that memory is unpoisoned.
When a value is read from memory, its shadow memory is also obtained and
propagated into all the operations which use that value. For every instruction
that takes one or more values the compiler generates code that calculates the
shadow of the result depending on those values and their shadows.

Example::

  int a = 0xff;  // i.e. 0x000000ff
  int b;
  int c = a | b;

In this case the shadow of ``a`` is ``0``, shadow of ``b`` is ``0xffffffff``,
shadow of ``c`` is ``0xffffff00``. This means that the upper three bytes of
``c`` are uninitialized, while the lower byte is initialized.

Origin tracking
---------------

Every four bytes of kernel memory also have a so-called origin mapped to them.
This origin describes the point in program execution at which the uninitialized
value was created. Every origin is associated with either the full allocation
stack (for heap-allocated memory), or the function containing the uninitialized
variable (for locals).

When an uninitialized variable is allocated on stack or heap, a new origin
value is created, and that variable's origin is filled with that value. When a
value is read from memory, its origin is also read and kept together with the
shadow. For every instruction that takes one or more values, the origin of the
result is one of the origins corresponding to any of the uninitialized inputs.
If a poisoned value is written into memory, its origin is written to the
corresponding storage as well.
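
The shadow and origin rules described so far can be modeled in plain userspace
C. The sketch below is only an illustration: ``struct tracked``,
``tracked_or()`` and ``or_example()`` are hypothetical names, the origin ID
``0x1234`` is made up, and the bitwise OR rule is assumed to match the one
described for the userspace MemorySanitizer rather than taken from the KMSAN
sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model of a 32-bit value and its KMSAN metadata. */
struct tracked {
	uint32_t value;
	uint32_t shadow;	/* a set bit marks an uninitialized bit */
	uint32_t origin;	/* opaque origin ID; 0 means "none" here */
};

/*
 * Bitwise OR: a result bit is initialized if both input bits are, or if
 * either input has an initialized 1 in that position.
 */
static struct tracked tracked_or(struct tracked a, struct tracked b)
{
	struct tracked r;

	r.value = a.value | b.value;
	r.shadow = (a.shadow & b.shadow) |
		   (~a.value & b.shadow) |
		   (a.shadow & ~b.value);
	/* Use the origin of one of the poisoned inputs, if any. */
	r.origin = b.shadow ? b.origin : (a.shadow ? a.origin : 0);
	return r;
}

/* The "int c = a | b" case, with a made-up origin ID for b. */
static struct tracked or_example(void)
{
	struct tracked a = { .value = 0xff, .shadow = 0, .origin = 0 };
	struct tracked b = { .value = 0, .shadow = 0xffffffff, .origin = 0x1234 };
	struct tracked c = tracked_or(a, b);

	assert(c.shadow == 0xffffff00);	/* upper three bytes are poisoned */
	return c;
}
```

Calling ``or_example()`` reproduces the shadow from the ``a | b`` example, and
the result carries the origin of ``b``, the only uninitialized input.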

Example 1::

  int a = 42;
  int b;
  int c = a + b;

In this case the origin of ``b`` is generated upon function entry, and is
stored to the origin of ``c`` right before the addition result is written into
memory.

Several variables may share the same origin address, if they are stored in the
same four-byte chunk. In this case every write to either variable updates the
origin for all of them. We have to sacrifice precision in this case, because
storing origins for individual bits (and even bytes) would be too costly.

Example 2::

  int combine(short a, short b) {
    union ret_t {
      int i;
      short s[2];
    } ret;
    ret.s[0] = a;
    ret.s[1] = b;
    return ret.i;
  }

If ``a`` is initialized and ``b`` is not, the shadow of the result would be
``0xffff0000``, and the origin of the result would be the origin of ``b``.
``ret.s[0]`` would have the same origin, but it will never be used, because
that variable is initialized.
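
The four-byte origin granularity behind ``combine()`` can be sketched with a
small userspace model. Everything here is hypothetical (``struct model``,
``store16()``, ``combine_origin()`` and the origin IDs passed in); it assumes a
little-endian machine such as x86_64 and is not how the KMSAN runtime is
implemented.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 8	/* size of the modeled memory, in bytes */

/* Hypothetical model: data, per-byte shadow, one origin per four bytes. */
struct model {
	uint8_t data[MEM_SIZE];
	uint8_t shadow[MEM_SIZE];
	uint32_t origin[MEM_SIZE / 4];
};

/* Store a 16-bit value together with its shadow and origin. */
static void store16(struct model *m, size_t off, uint16_t val,
		    uint16_t shadow, uint32_t origin)
{
	assert(off % 2 == 0 && off + sizeof(val) <= MEM_SIZE);
	memcpy(&m->data[off], &val, sizeof(val));
	memcpy(&m->shadow[off], &shadow, sizeof(shadow));
	/* Only poisoned stores update the origin slot shared by 4 bytes. */
	if (shadow)
		m->origin[off / 4] = origin;
}

/*
 * Model of combine(): performs both 16-bit stores, writes the shadow of
 * ret.i to *shadow_out and returns the origin slot covering ret.i.
 */
static uint32_t combine_origin(uint16_t shadow_a, uint32_t origin_a,
			       uint16_t shadow_b, uint32_t origin_b,
			       uint32_t *shadow_out)
{
	struct model m;

	memset(&m, 0, sizeof(m));
	store16(&m, 0, 0, shadow_a, origin_a);	/* ret.s[0] = a */
	store16(&m, 2, 0, shadow_b, origin_b);	/* ret.s[1] = b */
	memcpy(shadow_out, &m.shadow[0], sizeof(*shadow_out));
	return m.origin[0];
}
```

With ``shadow_a == 0`` and ``shadow_b == 0xffff`` the model yields the shadow
``0xffff0000`` and the origin of ``b``. When both halves are poisoned, the
second store overwrites the shared slot, so only one origin survives.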

If both function arguments are uninitialized, only the origin of the second
argument is preserved.

Origin chaining
~~~~~~~~~~~~~~~

To ease debugging, KMSAN creates a new origin for every store of an
uninitialized value to memory. The new origin references both its creation stack
and the previous origin the value had. This may cause increased memory
consumption, so we limit the length of origin chains in the runtime.

Clang instrumentation API
-------------------------

The Clang instrumentation pass inserts calls to functions defined in
``mm/kmsan/instrumentation.c`` into the kernel code.

Shadow manipulation
~~~~~~~~~~~~~~~~~~~

For every memory access the compiler emits a call to a function that returns a
pair of pointers to the shadow and origin addresses of the given memory::

  typedef struct {
    void *shadow, *origin;
  } shadow_origin_ptr_t;

  shadow_origin_ptr_t __msan_metadata_ptr_for_load_{1,2,4,8}(void *addr)
  shadow_origin_ptr_t __msan_metadata_ptr_for_store_{1,2,4,8}(void *addr)
  shadow_origin_ptr_t __msan_metadata_ptr_for_load_n(void *addr, uintptr_t size)
  shadow_origin_ptr_t __msan_metadata_ptr_for_store_n(void *addr, uintptr_t size)

The function name depends on the memory access size.

The compiler makes sure that for every loaded value its shadow and origin
values are read from memory. When a value is stored to memory, its shadow and
origin are also stored using the metadata pointers.
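
To make the load/store pattern concrete, here is a userspace sketch of what
instrumented four-byte accesses could look like. The arena, the
``mock_metadata_ptr_4()`` lookup and the ``instrumented_*()`` helpers are
hypothetical stand-ins; the real lookup functions live in the KMSAN runtime and
the real instrumentation is emitted by the compiler, not written by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARENA_WORDS 16	/* modeled memory, in 32-bit words */

typedef struct {
	void *shadow, *origin;
} shadow_origin_ptr_t;

/* Hypothetical arena standing in for memory and its metadata. */
static uint32_t arena[ARENA_WORDS];
static uint32_t arena_shadow[ARENA_WORDS];
static uint32_t arena_origin[ARENA_WORDS];

/* Mock of a 4-byte metadata lookup; not the kernel implementation. */
static shadow_origin_ptr_t mock_metadata_ptr_4(void *addr)
{
	ptrdiff_t idx = (uint32_t *)addr - arena;
	shadow_origin_ptr_t ret;

	assert(idx >= 0 && idx < ARENA_WORDS);
	ret.shadow = &arena_shadow[idx];
	ret.origin = &arena_origin[idx];
	return ret;
}

/* What an instrumented "*p = val" could look like. */
static void instrumented_store(uint32_t *p, uint32_t val,
			       uint32_t val_shadow, uint32_t val_origin)
{
	shadow_origin_ptr_t m = mock_metadata_ptr_4(p);

	*(uint32_t *)m.shadow = val_shadow;
	/* The origin is only written for poisoned values. */
	if (val_shadow)
		*(uint32_t *)m.origin = val_origin;
	*p = val;
}

/* What an instrumented "val = *p" could look like. */
static uint32_t instrumented_load(uint32_t *p, uint32_t *val_shadow,
				  uint32_t *val_origin)
{
	shadow_origin_ptr_t m = mock_metadata_ptr_4(p);

	*val_shadow = *(uint32_t *)m.shadow;
	*val_origin = *(uint32_t *)m.origin;
	return *p;
}
```

In this model a store with a non-zero shadow makes a later load from the same
address return that shadow and origin alongside the data.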

Handling locals
~~~~~~~~~~~~~~~

A special function is used to create a new origin value for a local variable and
set the origin of that variable to that value::

  void __msan_poison_alloca(void *addr, uintptr_t size, char *descr)

Access to per-task data
~~~~~~~~~~~~~~~~~~~~~~~

At the beginning of every instrumented function KMSAN inserts a call to
``__msan_get_context_state()``::

  kmsan_context_state *__msan_get_context_state(void)

``kmsan_context_state`` is declared in ``include/linux/kmsan.h``::

  struct kmsan_context_state {
    char param_tls[KMSAN_PARAM_SIZE];
    char retval_tls[KMSAN_RETVAL_SIZE];
    char va_arg_tls[KMSAN_PARAM_SIZE];
    char va_arg_origin_tls[KMSAN_PARAM_SIZE];
    u64 va_arg_overflow_size_tls;
    char param_origin_tls[KMSAN_PARAM_SIZE];
    depot_stack_handle_t retval_origin_tls;
  };

This structure is used by KMSAN to pass parameter shadows and origins between
instrumented functions (unless the parameters are checked immediately by
``CONFIG_KMSAN_CHECK_PARAM_RETVAL``).

Passing uninitialized values to functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Clang's MemorySanitizer instrumentation has an option,
``-fsanitize-memory-param-retval``, which makes the compiler check function
parameters passed by value, as well as function return values.

The option is controlled by ``CONFIG_KMSAN_CHECK_PARAM_RETVAL``, which is
enabled by default to let KMSAN report uninitialized values earlier.
Please refer to the `LKML discussion`_ for more details.

Because of the way the checks are implemented in LLVM (they are only applied to
parameters marked as ``noundef``), not all parameters are guaranteed to be
checked, so we cannot give up the metadata storage in ``kmsan_context_state``.

String functions
~~~~~~~~~~~~~~~~

The compiler replaces calls to ``memcpy()``/``memmove()``/``memset()`` with the
following functions. These functions are also called when data structures are
initialized or copied, making sure shadow and origin values are copied alongside
the data::

  void *__msan_memcpy(void *dst, void *src, uintptr_t n)
  void *__msan_memmove(void *dst, void *src, uintptr_t n)
  void *__msan_memset(void *dst, int c, uintptr_t n)

Error reporting
~~~~~~~~~~~~~~~

For each use of a value the compiler emits a shadow check that calls
``__msan_warning()`` if that value is poisoned::

  void __msan_warning(u32 origin)

``__msan_warning()`` causes the KMSAN runtime to print an error report.

Inline assembly instrumentation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

KMSAN instruments every inline assembly output with a call to the following
function, which unpoisons the memory region::

  void __msan_instrument_asm_store(void *addr, uintptr_t size)

This approach may mask certain errors, but it also helps to avoid a lot of
false positives in bitwise operations, atomics etc.

Sometimes the pointers passed into inline assembly do not point to valid memory.
In such cases they are ignored at runtime.


Runtime library
---------------

The code is located in ``mm/kmsan/``.

Per-task KMSAN state
~~~~~~~~~~~~~~~~~~~~

Every task_struct has an associated KMSAN task state that holds the KMSAN
context (see above) and a per-task flag disallowing KMSAN reports::

  struct kmsan_context {
    ...
    bool allow_reporting;
    struct kmsan_context_state cstate;
    ...
  };

  struct task_struct {
    ...
    struct kmsan_context kmsan;
    ...
  };

KMSAN contexts
~~~~~~~~~~~~~~

When running in a kernel task context, KMSAN uses ``current->kmsan.cstate`` to
hold the metadata for function parameters and return values.
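
The way parameter and return value shadows travel through the context state can
be sketched as follows. This is a reduced userspace model under stated
assumptions: ``struct model_context_state``, ``task_state``, the ``MODEL_*``
sizes and all helper functions are hypothetical, origins are left out, and the
shadow of the sum is approximated as the OR of the operand shadows.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PARAM_SIZE 64	/* hypothetical; not the kernel's value */
#define MODEL_RETVAL_SIZE 64	/* hypothetical; not the kernel's value */

/* Reduced model of the context state: shadows only, no origins. */
struct model_context_state {
	char param_tls[MODEL_PARAM_SIZE];
	char retval_tls[MODEL_RETVAL_SIZE];
};

/* Stands in for current->kmsan.cstate of a single task. */
static struct model_context_state task_state;

/* Copy a 4-byte shadow into the parameter area at the given byte offset. */
static void set_param_shadow(size_t off, uint32_t shadow)
{
	assert(off + sizeof(shadow) <= MODEL_PARAM_SIZE);
	memcpy(&task_state.param_tls[off], &shadow, sizeof(shadow));
}

/* Callee: "int twice(int x) { return x + x; }" instrumented by hand. */
static int twice(int x)
{
	uint32_t x_shadow, ret_shadow;

	/* Read the shadow of the first parameter left by the caller. */
	memcpy(&x_shadow, task_state.param_tls, sizeof(x_shadow));
	/* x + x: approximated as the OR of both operand shadows. */
	ret_shadow = x_shadow | x_shadow;
	memcpy(task_state.retval_tls, &ret_shadow, sizeof(ret_shadow));
	return x + x;
}

/* Caller: publishes the argument shadow, then fetches the result shadow. */
static int call_twice(int x, uint32_t x_shadow, uint32_t *ret_shadow)
{
	int ret;

	set_param_shadow(0, x_shadow);
	ret = twice(x);
	memcpy(ret_shadow, task_state.retval_tls, sizeof(*ret_shadow));
	return ret;
}
```

The point of the model is that caller and callee never exchange shadows
directly: both sides agree on slots in a shared per-context buffer.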

When the kernel is running in interrupt, softirq or NMI context, where
``current`` is unavailable, KMSAN switches to per-cpu interrupt state::

  DEFINE_PER_CPU(struct kmsan_ctx, kmsan_percpu_ctx);

Metadata allocation
~~~~~~~~~~~~~~~~~~~

There are several places in the kernel for which the metadata is stored.

1. Each ``struct page`` instance contains two pointers to its shadow and
origin pages::

  struct page {
    ...
    struct page *shadow, *origin;
    ...
  };

At boot-time, the kernel allocates shadow and origin pages for every available
kernel page. This is done quite late, when the kernel address space is already
fragmented, so normal data pages may arbitrarily interleave with the metadata
pages.

This means that in general for two contiguous memory pages their shadow/origin
pages may not be contiguous. Consequently, if a memory access crosses the
boundary of a memory block, accesses to shadow/origin memory may potentially
corrupt other pages or read incorrect values from them.

In practice, contiguous memory pages returned by the same ``alloc_pages()``
call will have contiguous metadata, whereas if these pages belong to two
different allocations their metadata pages can be fragmented.

For the kernel data (``.data``, ``.bss`` etc.) and percpu memory regions
there also are no guarantees on metadata contiguity.

In the case ``__msan_metadata_ptr_for_XXX_YYY()`` hits the border between two
pages with non-contiguous metadata, it returns pointers to fake shadow/origin
regions::

  char dummy_load_page[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));
  char dummy_store_page[PAGE_SIZE] __attribute__((aligned(PAGE_SIZE)));

``dummy_load_page`` is zero-initialized, so reads from it always yield zeroes.
All stores to ``dummy_store_page`` are ignored.

2. For vmalloc memory and modules, there is a direct mapping between the memory
range, its shadow and origin. KMSAN reduces the vmalloc area by 3/4, making only
the first quarter available to ``vmalloc()``. The second quarter of the vmalloc
area contains shadow memory for the first quarter, the third one holds the
origins. A small part of the fourth quarter contains shadow and origins for the
kernel modules. Please refer to ``arch/x86/include/asm/pgtable_64_types.h`` for
more details.

When an array of pages is mapped into a contiguous virtual memory space, their
shadow and origin pages are similarly mapped into contiguous regions.

References
==========

E. Stepanov, K. Serebryany. `MemorySanitizer: fast detector of uninitialized
memory use in C++
<https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43308.pdf>`_.
In Proceedings of CGO 2015.

.. _MemorySanitizer tool: https://clang.llvm.org/docs/MemorySanitizer.html
.. _LLVM documentation: https://llvm.org/docs/GettingStarted.html
.. _LKML discussion: https://lore.kernel.org/all/20220614144853.3693273-1-glider@google.com/