.. _pagemap:

=============================
Examining Process Page Tables
=============================

pagemap is a new (as of 2.6.25) set of interfaces in the kernel that allow
userspace programs to examine the page tables and related information by
reading files in ``/proc``.

There are four components to pagemap:

 * ``/proc/pid/pagemap``.  This file lets a userspace process find out which
   physical frame each virtual page is mapped to.  It contains one 64-bit
   value for each virtual page, containing the following data (from
   ``fs/proc/task_mmu.c``, above pagemap_read):

    * Bits 0-54  page frame number (PFN) if present
    * Bits 0-4   swap type if swapped
    * Bits 5-54  swap offset if swapped
    * Bit  55    pte is soft-dirty (see Documentation/admin-guide/mm/soft-dirty.rst)
    * Bit  56    page exclusively mapped (since 4.2)
    * Bits 57-60 zero
    * Bit  61    page is file-page or shared-anon (since 3.5)
    * Bit  62    page swapped
    * Bit  63    page present

   Since Linux 4.0 only users with the CAP_SYS_ADMIN capability can get PFNs.
   In 4.0 and 4.1 opens by unprivileged users fail with -EPERM.  Starting from
   4.2 the PFN field is zeroed if the user does not have CAP_SYS_ADMIN.
   Reason: information about PFNs helps in exploiting the Rowhammer
   vulnerability.

   If the page is not present but in swap, then the PFN contains an
   encoding of the swap file number and the page's offset into the
   swap. Unmapped pages return a null PFN. This allows determining
   precisely which pages are mapped (or in swap) and comparing mapped
   pages between processes.

   Efficient users of this interface will use ``/proc/pid/maps`` to
   determine which areas of memory are actually mapped and llseek to
   skip over unmapped regions.

 * ``/proc/kpagecount``.  This file contains a 64-bit count of the number of
   times each page is mapped, indexed by PFN.

 * ``/proc/kpageflags``.  This file contains a 64-bit set of flags for each
   page, indexed by PFN.

   The flags are (from ``fs/proc/page.c``, above kpageflags_read):

    0. LOCKED
    1. ERROR
    2. REFERENCED
    3. UPTODATE
    4. DIRTY
    5. LRU
    6. ACTIVE
    7. SLAB
    8. WRITEBACK
    9. RECLAIM
    10. BUDDY
    11. MMAP
    12. ANON
    13. SWAPCACHE
    14. SWAPBACKED
    15. COMPOUND_HEAD
    16. COMPOUND_TAIL
    17. HUGE
    18. UNEVICTABLE
    19. HWPOISON
    20. NOPAGE
    21. KSM
    22. THP
    23. BALLOON
    24. ZERO_PAGE
    25. IDLE

 * ``/proc/kpagecgroup``.  This file contains a 64-bit inode number of the
   memory cgroup each page is charged to, indexed by PFN. Only available when
   CONFIG_MEMCG is set.

Short descriptions of the page flags
====================================

0 - LOCKED
    page is being locked for exclusive access, e.g. by undergoing read/write IO
7 - SLAB
    page is managed by the SLAB/SLOB/SLUB/SLQB kernel memory allocator.
    When a compound page is used, SLUB/SLQB will only set this flag on the
    head page; SLOB will not flag it at all.
10 - BUDDY
    a free memory block managed by the buddy system allocator.
    The buddy system organizes free memory in blocks of various orders.
    An order N block has 2^N physically contiguous pages, with the BUDDY flag
    set for and _only_ for the first page.
15 - COMPOUND_HEAD
    A compound page with order N consists of 2^N physically contiguous pages.
    A compound page with order 2 takes the form of "HTTT", where H denotes its
    head page and T denotes its tail page(s).  The major consumers of compound
    pages are hugeTLB pages (Documentation/admin-guide/mm/hugetlbpage.rst),
    the SLUB etc. memory allocators and various device drivers.  However in
    this interface, only huge/giga pages are made visible to end users.
16 - COMPOUND_TAIL
    A compound page tail (see description above).
17 - HUGE
    this is an integral part of a HugeTLB page
19 - HWPOISON
    hardware detected memory corruption on this page: don't touch the data!
20 - NOPAGE
    no page frame exists at the requested address
21 - KSM
    identical memory pages dynamically shared between one or more processes
22 - THP
    contiguous pages which construct transparent hugepages
23 - BALLOON
    balloon compaction page
24 - ZERO_PAGE
    zero page for pfn_zero or huge_zero page
25 - IDLE
    page has not been accessed since it was marked idle (see
    Documentation/admin-guide/mm/idle_page_tracking.rst).  Note that this
    flag may be stale in case the page was accessed via a PTE.  To make sure
    the flag is up-to-date one has to read
    ``/sys/kernel/mm/page_idle/bitmap`` first.

IO related page flags
---------------------

1 - ERROR
    IO error occurred
3 - UPTODATE
    page has up-to-date data,
    i.e. for file backed page: (in-memory data revision >= on-disk one)
4 - DIRTY
    page has been written to, hence contains new data,
    i.e. for file backed page: (in-memory data revision > on-disk one)
8 - WRITEBACK
    page is being synced to disk

LRU related page flags
----------------------

5 - LRU
    page is in one of the LRU lists
6 - ACTIVE
    page is in the active LRU list
18 - UNEVICTABLE
    page is in the unevictable (non-)LRU list.  It is somehow pinned and
    not a candidate for LRU page reclaims, e.g. ramfs pages,
    shmctl(SHM_LOCK) and mlock() memory segments
2 - REFERENCED
    page has been referenced since last LRU list enqueue/requeue
9 - RECLAIM
    page will be reclaimed soon after its pageout IO completed
11 - MMAP
    a memory mapped page
12 - ANON
    a memory mapped page that is not part of a file
13 - SWAPCACHE
    page is mapped to swap space, i.e. has an associated swap entry
14 - SWAPBACKED
    page is backed by swap/RAM

The page-types tool in the tools/vm directory can be used to query the
above flags.
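
As a rough illustration of how the flag bits listed for ``/proc/kpageflags``
can be interpreted, the following Python sketch turns one 64-bit flags word
into a list of flag names. The names ``KPF_NAMES``, ``decode_kpageflags`` and
``read_kpageflags`` are hypothetical and chosen only for this example. The
sketch assumes the bit numbering given in this document and that the file
holds one native-endian 64-bit word per PFN. It is not a replacement for the
page-types tool.

```python
import os
import struct

# Flag names indexed by bit number, copied from the kpageflags list in this
# document (bit 0 = LOCKED ... bit 25 = IDLE).
KPF_NAMES = [
    "LOCKED", "ERROR", "REFERENCED", "UPTODATE", "DIRTY", "LRU", "ACTIVE",
    "SLAB", "WRITEBACK", "RECLAIM", "BUDDY", "MMAP", "ANON", "SWAPCACHE",
    "SWAPBACKED", "COMPOUND_HEAD", "COMPOUND_TAIL", "HUGE", "UNEVICTABLE",
    "HWPOISON", "NOPAGE", "KSM", "THP", "BALLOON", "ZERO_PAGE", "IDLE",
]


def decode_kpageflags(value):
    """Return the names of all flags set in one 64-bit kpageflags word."""
    return [name for bit, name in enumerate(KPF_NAMES) if value & (1 << bit)]


def read_kpageflags(pfn, path="/proc/kpageflags"):
    """Read the flags word for one PFN (hypothetical helper).

    Each entry is 8 bytes and the file is indexed by PFN, so the byte
    offset is pfn * 8. Reading this file typically requires root.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        data = os.pread(fd, 8, pfn * 8)
    finally:
        os.close(fd)
    if len(data) != 8:
        raise ValueError("no kpageflags entry for PFN %d" % pfn)
    # "=Q": one unsigned 64-bit integer in native byte order.
    return struct.unpack("=Q", data)[0]
```

For example, ``decode_kpageflags(read_kpageflags(pfn))`` would give the flag
names for a PFN obtained from ``/proc/pid/pagemap``, provided the caller is
allowed to read both files.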

Using pagemap to do something useful
====================================

The general procedure for using pagemap to find out about a process' memory
usage goes like this:

 1. Read ``/proc/pid/maps`` to determine which parts of the memory space are
    mapped to what.
 2. Select the maps you are interested in -- all of them, or a particular
    library, or the stack or the heap, etc.
 3. Open ``/proc/pid/pagemap`` and seek to the pages you would like to examine.
 4. Read a u64 for each page from pagemap.
 5. Open ``/proc/kpagecount`` and/or ``/proc/kpageflags``.  For each PFN you
    just read, seek to that entry in the file, and read the data you want.

For example, to find the "unique set size" (USS), which is the amount of
memory that a process is using that is not shared with any other process,
you can go through every map in the process, find the PFNs, look those up
in kpagecount, and tally up the number of pages that are only referenced
once.

Other notes
===========

Reading from any of the files will return -EINVAL if you are not starting
the read on an 8-byte boundary (e.g., if you sought an odd number of bytes
into the file), or if the size of the read is not a multiple of 8 bytes.

Before Linux 3.11 pagemap bits 55-60 were used for "page-shift" (which is
always 12 on most architectures). Since Linux 3.11 their meaning changes
after the first clear of the soft-dirty bits. Since Linux 4.2 they are used
for flags unconditionally.
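
To make the alignment rule and the ``/proc/pid/pagemap`` bit layout more
concrete, here is a minimal Python sketch that reads the entry for one virtual
address and splits it into fields. The helper names ``read_pagemap_entry`` and
``decode_pagemap_entry`` are hypothetical. The sketch assumes the Linux 4.2
and later layout listed in this document and native-endian 64-bit entries.
Without CAP_SYS_ADMIN the PFN field reads as zero, as described above.

```python
import os
import struct

LOW_MASK = (1 << 55) - 1  # bits 0-54: PFN, or swap type and offset


def decode_pagemap_entry(entry):
    """Split one 64-bit pagemap entry into the fields listed in this document."""
    present = bool((entry >> 63) & 1)
    swapped = bool((entry >> 62) & 1)
    info = {
        "present": present,
        "swapped": swapped,
        "file_or_shared_anon": bool((entry >> 61) & 1),
        "exclusive": bool((entry >> 56) & 1),
        "soft_dirty": bool((entry >> 55) & 1),
        "pfn": None,
        "swap_type": None,
        "swap_offset": None,
    }
    low = entry & LOW_MASK
    if present:
        info["pfn"] = low                # bits 0-54
    elif swapped:
        info["swap_type"] = low & 0x1F   # bits 0-4
        info["swap_offset"] = low >> 5   # bits 5-54
    return info


def read_pagemap_entry(pid, vaddr):
    """Read the raw pagemap entry covering virtual address vaddr."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    # One 8-byte entry per virtual page, so the offset is always a
    # multiple of 8 and the read is exactly 8 bytes long.
    offset = (vaddr // page_size) * 8
    fd = os.open("/proc/%d/pagemap" % pid, os.O_RDONLY)
    try:
        data = os.pread(fd, 8, offset)
    finally:
        os.close(fd)
    if len(data) != 8:
        raise ValueError("no pagemap entry at address %#x" % vaddr)
    return struct.unpack("=Q", data)[0]
```

A caller would typically take ``vaddr`` values from the ranges in
``/proc/pid/maps`` and pass the resulting ``pfn`` values on to
``/proc/kpagecount`` or ``/proc/kpageflags``, using the same
8-bytes-per-entry indexing.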