Kernel Memory Leak Detector
===========================

Kmemleak provides a way of detecting possible kernel memory leaks in a
way similar to a `tracing garbage collector
<https://en.wikipedia.org/wiki/Tracing_garbage_collection>`_,
with the difference that the orphan objects are not freed but only
reported via /sys/kernel/debug/kmemleak. A similar method is used by the
Valgrind tool (``memcheck --leak-check``) to detect the memory leaks in
user-space applications.

Usage
-----

CONFIG_DEBUG_KMEMLEAK in "Kernel hacking" has to be enabled. A kernel
thread scans the memory every 10 minutes (by default) and prints the
number of new unreferenced objects found.
If the ``debugfs`` isn't already
mounted, mount with::

  # mount -t debugfs nodev /sys/kernel/debug/

To display the details of all the possible scanned memory leaks::

  # cat /sys/kernel/debug/kmemleak

To trigger an intermediate memory scan::

  # echo scan > /sys/kernel/debug/kmemleak

To clear the list of all current possible memory leaks::

  # echo clear > /sys/kernel/debug/kmemleak

New leaks will then come up upon reading ``/sys/kernel/debug/kmemleak``
again.

Note that the orphan objects are listed in the order they were allocated
and one object at the beginning of the list may cause other subsequent
objects to be reported as orphan.

Memory scanning parameters can be modified at run-time by writing to the
``/sys/kernel/debug/kmemleak`` file.
The following parameters are supported:

- off
    disable kmemleak (irreversible)
- stack=on
    enable the task stacks scanning (default)
- stack=off
    disable the task stacks scanning
- scan=on
    start the automatic memory scanning thread (default)
- scan=off
    stop the automatic memory scanning thread
- scan=<secs>
    set the automatic memory scanning period in seconds
    (default 600, 0 to stop the automatic scanning)
- scan
    trigger a memory scan
- clear
    clear the list of current memory leak suspects, done by
    marking all currently reported unreferenced objects grey,
    or free all kmemleak objects if kmemleak has been disabled.
- dump=<addr>
    dump information about the object found at <addr>

Kmemleak can also be disabled at boot-time by passing ``kmemleak=off`` on
the kernel command line.

Memory may be allocated or freed before kmemleak is initialised and
these actions are stored in an early log buffer. The size of this buffer
is configured via the CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE option.

If CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF is enabled, kmemleak is
disabled by default.
Passing ``kmemleak=on`` on the kernel command
line enables the function.

If you are getting errors like "Error while writing to stdout" or "write_loop:
Invalid argument", make sure kmemleak is properly enabled.

Basic Algorithm
---------------

The memory allocations via :c:func:`kmalloc`, :c:func:`vmalloc`,
:c:func:`kmem_cache_alloc` and
friends are traced and the pointers, together with additional
information like size and stack trace, are stored in a rbtree.
The corresponding freeing function calls are tracked and the pointers
removed from the kmemleak data structures.

An allocated block of memory is considered orphan if no pointer to its
start address or to any location inside the block can be found by
scanning the memory (including saved registers). This means that there
might be no way for the kernel to pass the address of the allocated
block to a freeing function and therefore the block is considered a
memory leak.

The scanning algorithm steps:

  1. mark all objects as white (remaining white objects will later be
     considered orphan)
  2. scan the memory starting with the data section and stacks, checking
     the values against the addresses stored in the rbtree.
     If a pointer to a white object is found, the object is added to the
     gray list
  3. scan the gray objects for matching addresses (some white objects
     can become gray and added at the end of the gray list) until the
     gray set is finished
  4. the remaining white objects are considered orphan and reported via
     /sys/kernel/debug/kmemleak

Some allocated memory blocks have pointers stored in the kernel's
internal data structures and they cannot be detected as orphans. To
avoid this, kmemleak can also store the number of values pointing to an
address inside the block address range that need to be found so that the
block is not considered a leak. One example is __vmalloc().

Testing specific sections with kmemleak
---------------------------------------

Upon initial bootup your /sys/kernel/debug/kmemleak output page may be
quite extensive. This can also be the case if you have very buggy code
when doing development. To work around these situations you can use the
'clear' command to clear all reported unreferenced objects from the
/sys/kernel/debug/kmemleak output. By issuing a 'scan' after a 'clear'
you can find new unreferenced objects; this should help with testing
specific sections of code.

To test a critical section on demand with a clean kmemleak do::

  # echo clear > /sys/kernel/debug/kmemleak
  ... test your kernel or modules ...
  # echo scan > /sys/kernel/debug/kmemleak

Then, as usual, get your report with::

  # cat /sys/kernel/debug/kmemleak

Freeing kmemleak internal objects
---------------------------------

To allow access to previously found memory leaks after kmemleak has been
disabled by the user or due to a fatal error, internal kmemleak objects
won't be freed when kmemleak is disabled, and those objects may occupy
a large part of physical memory.

In this situation, you may reclaim memory with::

  # echo clear > /sys/kernel/debug/kmemleak

Kmemleak API
------------

See the include/linux/kmemleak.h header for the function prototypes.

- ``kmemleak_init`` - initialize kmemleak
- ``kmemleak_alloc`` - notify of a memory block allocation
- ``kmemleak_alloc_percpu`` - notify of a percpu memory block allocation
- ``kmemleak_vmalloc`` - notify of a vmalloc() memory allocation
- ``kmemleak_free`` - notify of a memory block freeing
- ``kmemleak_free_part`` - notify of a partial memory block freeing
- ``kmemleak_free_percpu`` - notify of a percpu memory block freeing
- ``kmemleak_update_trace`` - update object allocation stack trace
- ``kmemleak_not_leak`` - mark an object as not a leak
- ``kmemleak_ignore`` - do not scan or report an object as leak
- ``kmemleak_scan_area`` - add scan areas inside a memory block
- ``kmemleak_no_scan`` - do not scan a memory block
- ``kmemleak_erase`` - erase an old value in a pointer variable
- ``kmemleak_alloc_recursive`` - as kmemleak_alloc but checks the recursiveness
- ``kmemleak_free_recursive`` - as kmemleak_free but checks the recursiveness

The following functions take a physical address as the object pointer
and only perform the corresponding action if the address has a lowmem
mapping:

- ``kmemleak_alloc_phys``
- ``kmemleak_free_part_phys``
- ``kmemleak_ignore_phys``

Dealing with false positives/negatives
--------------------------------------

The false negatives are real memory leaks (orphan objects) but not
reported by kmemleak because values found during the memory scanning
point to such objects. To reduce the number of false negatives, kmemleak
provides the kmemleak_ignore, kmemleak_scan_area, kmemleak_no_scan and
kmemleak_erase functions (see above). The task stacks also increase the
amount of false negatives; their scanning is enabled by default (see the
``stack=on`` parameter above) and can be turned off with ``stack=off``.

The false positives are objects wrongly reported as being memory leaks
(orphan). For objects known not to be leaks, kmemleak provides the
kmemleak_not_leak function. The kmemleak_ignore function could also be
used if the memory block is known not to contain other pointers, in which
case it will no longer be scanned.

Some of the reported leaks are only transient, especially on SMP
systems, because of pointers temporarily stored in CPU registers or
stacks. Kmemleak defines MSECS_MIN_AGE (defaulting to 1000) representing
the minimum age of an object to be reported as a memory leak.

Limitations and Drawbacks
-------------------------

The main drawback is the reduced performance of memory allocation and
freeing. To avoid other penalties, the memory scanning is only performed
when the /sys/kernel/debug/kmemleak file is read.
Anyway, this tool is
intended for debugging purposes where the performance might not be the
most important requirement.

To keep the algorithm simple, kmemleak scans for values pointing to any
address inside a block's address range. This may lead to an increased
number of false negatives. However, it is likely that a real memory leak
will eventually become visible.

Another source of false negatives is the data stored in non-pointer
values. In a future version, kmemleak could only scan the pointer
members in the allocated structures. This feature would solve many of
the false negative cases described above.

The tool can report false positives. These are cases where an allocated
block doesn't need to be freed (some cases in the init_call functions),
the pointer is calculated by other methods than the usual container_of
macro or the pointer is stored in a location not scanned by kmemleak.

Page allocations and ioremap are not tracked.

Testing with kmemleak-test
--------------------------

To check if you have all set up to use kmemleak, you can use the kmemleak-test
module, a module that deliberately leaks memory. Set CONFIG_SAMPLE_KMEMLEAK
as module (it can't be used as built-in) and boot the kernel with kmemleak
enabled.
Load the module and perform a scan with::

  # modprobe kmemleak-test
  # echo scan > /sys/kernel/debug/kmemleak

Note that you may not get results instantly or on the first scan. When
kmemleak gets results, it'll log ``kmemleak: <count of leaks> new suspected
memory leaks``. Then read the file to see them::

  # cat /sys/kernel/debug/kmemleak
  unreferenced object 0xffff89862ca702e8 (size 32):
    comm "modprobe", pid 2088, jiffies 4294680594 (age 375.486s)
    hex dump (first 32 bytes):
      6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
      6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5  kkkkkkkkkkkkkkk.
    backtrace:
      [<00000000e0a73ec7>] 0xffffffffc01d2036
      [<000000000c5d2a46>] do_one_initcall+0x41/0x1df
      [<0000000046db7e0a>] do_init_module+0x55/0x200
      [<00000000542b9814>] load_module+0x203c/0x2480
      [<00000000c2850256>] __do_sys_finit_module+0xba/0xe0
      [<000000006564e7ef>] do_syscall_64+0x43/0x110
      [<000000007c873fa6>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
  ...

Removing the module with ``rmmod kmemleak_test`` should also trigger some
kmemleak results.
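
When a report contains many records, it can help to summarise it per task
before reading individual backtraces. The following is a minimal user-space
sketch, not part of kmemleak itself. It assumes the report layout shown in
the sample above (an ``unreferenced object ... (size N):`` header followed by
a ``comm "...", pid N`` line); other kernel versions may print a slightly
different layout. The names ``parse_report`` and ``summarize`` are
hypothetical helpers introduced only for this illustration.

```python
import re
from collections import Counter

# Header line of each record, as in the sample report above (assumed layout).
HEADER_RE = re.compile(r"^unreferenced object (0x[0-9a-fA-F]+) \(size (\d+)\):")
# The "comm" line naming the task that made the allocation (assumed layout).
COMM_RE = re.compile(r'^comm "([^"]*)", pid (\d+)')


def parse_report(text):
    """Return one dict per 'unreferenced object' record found in text."""
    records = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        header = HEADER_RE.match(line)
        if header:
            # Start a new record; comm and pid are filled in by a later line.
            records.append({
                "addr": int(header.group(1), 16),
                "size": int(header.group(2)),
                "comm": None,
                "pid": None,
            })
            continue
        comm = COMM_RE.match(line)
        if comm and records:
            records[-1]["comm"] = comm.group(1)
            records[-1]["pid"] = int(comm.group(2))
    return records


def summarize(records):
    """Return the total number of suspected leaked bytes per task name."""
    totals = Counter()
    for record in records:
        totals[record["comm"]] += record["size"]
    return totals
```

To use it, read ``/sys/kernel/debug/kmemleak`` as root (or a saved copy of
the report), pass the text to ``parse_report`` and the result to
``summarize``. The hex dump and backtrace lines are ignored on purpose, so
the summary is only a starting point for deciding which records to inspect.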