==========
VMCOREINFO
==========

What is it?
===========

VMCOREINFO is a special ELF note section. It contains various
information from the kernel like structure size, page size, symbol
values, field offsets, etc. These data are packed into an ELF note
section and used by user-space tools like crash and makedumpfile to
analyze a kernel's memory layout.

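The note payload is plain text with one ``KEY=VALUE`` entry per line.
The following is a minimal sketch of reading it, assuming the payload
has already been extracted from the ELF note as a string. The function
name and the sample values are illustrative only. Symbol values are
printed in hexadecimal, while sizes and offsets are decimal.

```python
def parse_vmcoreinfo(text):
    """Parse VMCOREINFO text into a dict of key -> raw string value."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, _, value = line.partition("=")
        info[key] = value
    return info


# Illustrative sample, not taken from a real dump.
sample = """OSRELEASE=6.1.0
PAGESIZE=4096
SYMBOL(swapper_pg_dir)=ffffffff82a0a000
SIZE(page)=64
OFFSET(page.flags)=0
"""

info = parse_vmcoreinfo(sample)
page_size = int(info["PAGESIZE"])                # decimal
pgd = int(info["SYMBOL(swapper_pg_dir)"], 16)    # hexadecimal, no 0x prefix
```
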
Common variables
================

init_uts_ns.name.release
------------------------

The version of the Linux kernel. Used to find the corresponding source
code from which the kernel has been built. For example, crash uses it to
find the corresponding vmlinux in order to process vmcore.

PAGE_SIZE
---------

The size of a page. It is the smallest unit of data used by the memory
management facilities. It is usually 4096 bytes in size, and a page is
aligned on a 4096-byte boundary. Used for computing page addresses.

init_uts_ns
-----------

The UTS namespace which is used to isolate two specific elements of the
system that relate to the uname(2) system call. It is named after the
data structure used to store information returned by the uname(2) system
call.

User-space tools can get the kernel name, host name, kernel release
number, kernel version, architecture name and OS type from it.

(uts_namespace, name)
---------------------

Offset of the name member of struct uts_namespace. crash and
makedumpfile get the start address of init_uts_ns.name from this.

node_online_map
---------------

An array node_states[N_ONLINE] which represents the set of online nodes
in a system, one bit position per node number. Used to keep track of
which nodes are in the system and online.

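Decoding such a bitmap is a matter of testing each bit position. A
minimal sketch, assuming the bitmap has already been read from the dump
as a list of unsigned longs (lowest node numbers first); the function
name and the 64-bit word size are assumptions of this example.

```python
def online_nodes(words, bits_per_long=64):
    """Return the node numbers whose bit is set in the bitmap."""
    nodes = []
    for index, word in enumerate(words):
        for bit in range(bits_per_long):
            if word & (1 << bit):
                nodes.append(index * bits_per_long + bit)
    return nodes
```

For example, a single word with the value 0b101 describes a system in
which nodes 0 and 2 are online.
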
swapper_pg_dir
--------------

The global page directory pointer of the kernel. Used to translate
virtual to physical addresses.

_stext
------

Defines the beginning of the text section. In general, _stext indicates
the kernel start address. Used to convert a virtual address from the
direct kernel map to a physical address.

vmap_area_list
--------------

Stores the virtual area list. makedumpfile gets the vmalloc start value
from this variable and its value is necessary for vmalloc translation.

mem_map
-------

Physical addresses are translated to struct pages by treating them as
an index into the mem_map array. Right-shifting a physical address by
PAGE_SHIFT bits converts it into a page frame number which is an index
into that mem_map array.

Used to map an address to the corresponding struct page.

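The translation described above can be sketched as follows. This
assumes the flat memory model with physical memory starting at page
frame 0; the function name is illustrative, and the size of struct page
would come from the ``page`` size entry.

```python
def phys_to_page(paddr, mem_map, page_shift, sizeof_page):
    """Return the address of the struct page describing paddr."""
    pfn = paddr >> page_shift           # page frame number
    return mem_map + pfn * sizeof_page  # address of mem_map[pfn]
```

With 4096-byte pages (PAGE_SHIFT of 12) and a 64-byte struct page,
physical address 0x3000 maps to the fourth entry of the array, that is,
192 bytes past the start of mem_map.
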
contig_page_data
----------------

makedumpfile gets the pglist_data structure from this symbol, which is
used to describe the memory layout.

User-space tools use this to exclude free pages when dumping memory.

mem_section|(mem_section, NR_SECTION_ROOTS)|(mem_section, section_mem_map)
--------------------------------------------------------------------------

The address of the mem_section array, its length, structure size, and
the section_mem_map offset.

It exists in the sparse memory mapping model. Like the mem_map variable,
it is used to translate an address.

MAX_PHYSMEM_BITS
----------------

Defines the maximum number of physical address bits supported.

page
----

The size of a page structure. struct page is an important data structure
and it is widely used to compute contiguous memory.

pglist_data
-----------

The size of a pglist_data structure. This value is used to check if the
pglist_data structure is valid. It is also used for checking the memory
type.

zone
----

The size of a zone structure. This value is used to check if the zone
structure has been found. It is also used for excluding free pages.

free_area
---------

The size of a free_area structure. It indicates whether the free_area
structure is valid or not. Useful when excluding free pages.

list_head
---------

The size of a list_head structure. Used when iterating lists in a
post-mortem analysis session.

nodemask_t
----------

The size of a nodemask_t type. Used to compute the number of online
nodes.

(page, flags|_refcount|mapping|lru|_mapcount|private|compound_order|compound_head)
----------------------------------------------------------------------------------

User-space tools compute their values based on the offset of these
variables. The variables are used when excluding unnecessary pages.

(pglist_data, node_zones|nr_zones|node_mem_map|node_start_pfn|node_spanned_pages|node_id)
-----------------------------------------------------------------------------------------

On NUMA machines, each NUMA node has a pg_data_t to describe its memory
layout. On UMA machines there is a single pglist_data which describes the
whole memory.

These values are used to check the memory type and to compute the
virtual address for memory map.

(zone, free_area|vm_stat|spanned_pages)
---------------------------------------

Each node is divided into a number of blocks called zones which
represent ranges within memory. A zone is described by a structure zone.

User-space tools compute required values based on the offset of these
variables.

(free_area, free_list)
----------------------

Offset of the free_list member. This value is used to compute the number
of free pages.

Each zone has a free_area structure array called free_area[NR_PAGE_ORDERS].
The free_list represents a linked list of free page blocks.

(list_head, next|prev)
----------------------

Offsets of the list_head's members. list_head is used to define a
circular linked list. User-space tools need these in order to traverse
lists.

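A traversal needs only the offset of next and a way to read memory from
the dump. The sketch below assumes a little-endian target and a
caller-supplied ``read(addr, size)`` function returning raw bytes; both
that function and the toy memory image are hypothetical stand-ins for a
real dump reader.

```python
import struct


def walk_list(read, head, next_offset, ptr_size=8):
    """Yield each list_head address linked after head, stopping on wrap-around."""
    fmt = "<Q" if ptr_size == 8 else "<I"
    node = struct.unpack(fmt, read(head + next_offset, ptr_size))[0]
    seen = set()  # guards against corrupted lists that never return to head
    while node != head and node not in seen:
        seen.add(node)
        yield node
        node = struct.unpack(fmt, read(node + next_offset, ptr_size))[0]


# Toy memory image: head at 0x100 -> 0x200 -> 0x300 -> back to head.
memory = {0x100: 0x200, 0x200: 0x300, 0x300: 0x100}


def read(addr, size):
    return struct.pack("<Q", memory[addr])


entries = list(walk_list(read, head=0x100, next_offset=0))
```

Each yielded address points at an embedded list_head, so the address of
the containing structure is obtained by subtracting the offset of the
list member within that structure.
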
(vmap_area, va_start|list)
--------------------------

Offsets of the vmap_area's members. They carry vmalloc-specific
information. makedumpfile gets the start address of the vmalloc region
from this.

(zone.free_area, NR_PAGE_ORDERS)
--------------------------------

Free areas descriptor. User-space tools use this value to iterate the
free_area ranges. NR_PAGE_ORDERS is used by the zone buddy allocator.

prb
---

A pointer to the printk ringbuffer (struct printk_ringbuffer). This
may be pointing to the static boot ringbuffer or the dynamically
allocated ringbuffer, depending on when the core dump occurred.
Used by user-space tools to read the active kernel log buffer.

printk_rb_static
----------------

A pointer to the static boot printk ringbuffer. If @prb has a
different value, this is useful for viewing the initial boot messages,
which may have been overwritten in the dynamically allocated
ringbuffer.

clear_seq
---------

The sequence number of the printk() record after the last clear
command. It indicates the first record after the last
SYSLOG_ACTION_CLEAR, such as the one issued by 'dmesg -c'. Used by
user-space tools to dump a subset of the dmesg log.

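Selecting that subset amounts to keeping records whose sequence number
is not below clear_seq. A minimal sketch, assuming the records have
already been decoded into (seq, text) pairs; the function name and the
pair representation are illustrative.

```python
def records_after_clear(records, clear_seq):
    """Keep only the records logged at or after the last clear command."""
    return [(seq, text) for seq, text in records if seq >= clear_seq]
```
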
printk_ringbuffer
-----------------

The size of a printk_ringbuffer structure. This structure contains all
information required for accessing the various components of the
kernel log buffer.

(printk_ringbuffer, desc_ring|text_data_ring|dict_data_ring|fail)
-----------------------------------------------------------------

Offsets for the various components of the printk ringbuffer. Used by
user-space tools to view the kernel log buffer without requiring the
declaration of the structure.

prb_desc_ring
-------------

The size of the prb_desc_ring structure. This structure contains
information about the set of record descriptors.

(prb_desc_ring, count_bits|descs|head_id|tail_id)
-------------------------------------------------

Offsets for the fields describing the set of record descriptors. Used
by user-space tools to be able to traverse the descriptors without
requiring the declaration of the structure.

prb_desc
--------

The size of the prb_desc structure. This structure contains
information about a single record descriptor.

(prb_desc, info|state_var|text_blk_lpos|dict_blk_lpos)
------------------------------------------------------

Offsets for the fields describing a record descriptor. Used by
user-space tools to be able to read descriptors without requiring
the declaration of the structure.

prb_data_blk_lpos
-----------------

The size of the prb_data_blk_lpos structure. This structure contains
information about where the text or dictionary data (data block) is
located within the respective data ring.

(prb_data_blk_lpos, begin|next)
-------------------------------

Offsets for the fields describing the location of a data block. Used
by user-space tools to be able to locate data blocks without
requiring the declaration of the structure.

printk_info
-----------

The size of the printk_info structure. This structure contains all
the meta-data for a record.

(printk_info, seq|ts_nsec|text_len|dict_len|caller_id)
------------------------------------------------------

Offsets for the fields providing the meta-data for a record. Used by
user-space tools to be able to read the information without requiring
the declaration of the structure.

prb_data_ring
-------------

The size of the prb_data_ring structure. This structure contains
information about a set of data blocks.

(prb_data_ring, size_bits|data|head_lpos|tail_lpos)
---------------------------------------------------

Offsets for the fields describing a set of data blocks. Used by
user-space tools to be able to access the data blocks without
requiring the declaration of the structure.

atomic_long_t
-------------

The size of the atomic_long_t structure. Used by user-space tools to
be able to copy the full structure, regardless of its
architecture-specific implementation.

(atomic_long_t, counter)
------------------------

Offset for the long value of an atomic_long_t variable. Used by
user-space tools to access the long value without requiring the
architecture-specific declaration.

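The size and offset together are enough to pull the value out of a raw
copy of the structure. A minimal sketch, assuming a little-endian
target and that the bytes of the atomic_long_t have already been copied
from the dump; the function name is illustrative.

```python
import struct


def read_atomic_long(raw, counter_offset, long_size=8):
    """Extract the signed long stored at counter_offset in raw bytes."""
    fmt = "<q" if long_size == 8 else "<i"
    return struct.unpack_from(fmt, raw, counter_offset)[0]
```
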
(free_area.free_list, MIGRATE_TYPES)
------------------------------------

The number of migrate types for pages. The free_list is described by the
array. Used by tools to compute the number of free pages.

NR_FREE_PAGES
-------------

On linux-2.6.21 or later, the number of free pages is in
vm_stat[NR_FREE_PAGES]. Used to get the number of free pages.

PG_lru|PG_private|PG_swapcache|PG_swapbacked|PG_slab|PG_hwpoison|PG_head_mask|PG_hugetlb
-----------------------------------------------------------------------------------------

Page attributes. These flags are used to filter out various pages that
are unnecessary for dumping.

PAGE_BUDDY_MAPCOUNT_VALUE(~PG_buddy)|PAGE_OFFLINE_MAPCOUNT_VALUE(~PG_offline)
-----------------------------------------------------------------------------

More page attributes. These flags are used to filter out various pages
that are unnecessary for dumping.


x86_64
======

phys_base
---------

Used to convert the virtual address of an exported kernel symbol to its
corresponding physical address.

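The conversion can be sketched as follows. The constant is the base of
the x86_64 kernel text mapping (__START_KERNEL_map), which is not part
of VMCOREINFO and is hard-coded here as an assumption of this example;
the function name is illustrative, and the virtual address is expected
to be the run-time address of the symbol.

```python
START_KERNEL_MAP = 0xffffffff80000000  # x86_64 __START_KERNEL_map


def kernel_symbol_to_phys(vaddr, phys_base):
    """Translate a kernel text mapping address to a physical address."""
    return vaddr - START_KERNEL_MAP + phys_base
```
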
init_top_pgt
------------

Used to walk through the whole page table and convert virtual addresses
to physical addresses. The init_top_pgt is somewhat similar to
swapper_pg_dir, but it is only used in x86_64.

pgtable_l5_enabled
------------------

User-space tools need to know whether the crash kernel was in 5-level
paging mode.

node_data
---------

This is a struct pglist_data array that stores information about all
NUMA nodes. makedumpfile gets the pglist_data structure from it.

(node_data, MAX_NUMNODES)
-------------------------

The maximum number of nodes in the system.

KERNELOFFSET
------------

The kernel randomization offset. Used to compute the page offset. If
KASLR is disabled, this value is zero.

KERNEL_IMAGE_SIZE
-----------------

Currently unused by makedumpfile. Used by crash to compute the module
virtual address.

sme_mask
--------

AMD-specific with SME support: it indicates the secure memory encryption
mask. makedumpfile needs to know whether the crash kernel was
encrypted. If SME is enabled in the first kernel, the crash kernel's
page table entries (pgd/pud/pmd/pte) contain the memory encryption
mask. This is used to remove the SME mask and obtain the true physical
address.

Currently, sme_mask stores the value of the C-bit position. If needed,
additional SME-relevant info can be placed in that variable.

For example::

  [ misc	        ][ enc bit  ][ other misc SME info       ]
  0000_0000_0000_0000_1000_0000_0000_0000_0000_0000_..._0000
  63   59   55   51   47   43   39   35   31   27   ... 3

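Removing the mask from a page table entry is a single bitwise
operation. A minimal sketch, using the bit 47 mask from the diagram
above as sample input; the function name is illustrative, and the
remaining flag bits of the entry still have to be masked off as usual
to obtain the page frame address.

```python
def strip_sme_mask(entry, sme_mask):
    """Clear the memory encryption bit(s) from a page table entry."""
    return entry & ~sme_mask
```
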
x86_32
======

X86_PAE
-------

Denotes whether physical address extensions are enabled. It has the cost
of a higher page table lookup overhead, and also consumes more page
table space per process. Used to check whether PAE was enabled in the
crash kernel when converting virtual addresses to physical addresses.

ia64
====

pgdat_list|(pgdat_list, MAX_NUMNODES)
-------------------------------------

pg_data_t array storing information about all NUMA nodes. MAX_NUMNODES
indicates the number of the nodes.

node_memblk|(node_memblk, NR_NODE_MEMBLKS)
------------------------------------------

List of node memory chunks. Filled when parsing the SRAT table to obtain
information about memory nodes. NR_NODE_MEMBLKS indicates the number of
node memory chunks.

These values are used to compute the number of nodes the crashed kernel used.

node_memblk_s|(node_memblk_s, start_paddr)|(node_memblk_s, size)
----------------------------------------------------------------

The size of a struct node_memblk_s and the offsets of the
node_memblk_s's members. Used to compute the number of nodes.

PGTABLE_3|PGTABLE_4
-------------------

User-space tools need to know whether the crash kernel was in 3-level or
4-level paging mode. Used to distinguish the page table.

ARM64
=====

VA_BITS
-------

The maximum number of bits for virtual addresses. Used to compute the
virtual memory ranges.

kimage_voffset
--------------

The offset between the kernel virtual and physical mappings. Used to
translate virtual to physical addresses.

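For an address inside the kernel image mapping, the translation is a
subtraction. A minimal sketch with an illustrative function name; it
does not apply to linear map addresses, and the sample values in the
note below are made up.

```python
def kimg_to_phys(vaddr, kimage_voffset):
    """Translate a kernel image virtual address to a physical address."""
    return vaddr - kimage_voffset
```

With a kimage_voffset of 0xffff7fffd0000000, the virtual address
0xffff800010000000 corresponds to physical address 0x40000000.
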
PHYS_OFFSET
-----------

Indicates the physical address of the start of memory. Like
kimage_voffset, it is used to translate virtual to physical
addresses.

KERNELOFFSET
------------

The kernel randomization offset. Used to compute the page offset. If
KASLR is disabled, this value is zero.

KERNELPACMASK
-------------

The mask to extract the Pointer Authentication Code from a kernel virtual
address.

TCR_EL1.T1SZ
------------

Indicates the size offset of the memory region addressed by TTBR1_EL1.
The region size is 2^(64-T1SZ) bytes.

TTBR1_EL1 is the table base address register specified by the ARMv8-A
architecture which is used to look up the page tables for the virtual
addresses in the higher VA range (refer to the ARMv8 ARM document for
more details).

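Because the TTBR1_EL1 region occupies the top of the 64-bit address
space, its bounds follow directly from T1SZ. A minimal sketch with an
illustrative function name:

```python
def ttbr1_region(t1sz):
    """Return (start, size) of the region translated through TTBR1_EL1."""
    size = 1 << (64 - t1sz)    # 2^(64 - T1SZ) bytes
    start = (1 << 64) - size   # region ends at the top of the address space
    return start, size
```

For example, a T1SZ of 16 corresponds to 48-bit virtual addresses, so
the region starts at 0xffff000000000000.
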
MODULES_VADDR|MODULES_END|VMALLOC_START|VMALLOC_END|VMEMMAP_START|VMEMMAP_END
-----------------------------------------------------------------------------

Used to get the correct ranges:

  * MODULES_VADDR ~ MODULES_END-1 : Kernel module space.
  * VMALLOC_START ~ VMALLOC_END-1 : vmalloc() / ioremap() space.
  * VMEMMAP_START ~ VMEMMAP_END-1 : vmemmap region, used for struct page array.

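A tool can use these bounds to decide which region an address belongs
to. The sketch below treats each END value as exclusive, matching the
END-1 notation above. The function name and the sample bounds are
hypothetical and would normally be read from VMCOREINFO.

```python
def classify(addr, ranges):
    """Return the name of the half-open range containing addr, or None."""
    for name, (start, end) in ranges.items():
        if start <= addr < end:
            return name
    return None


# Hypothetical bounds, for illustration only.
ranges = {
    "modules": (0xffff800000000000, 0xffff800008000000),
    "vmalloc": (0xffff800008000000, 0xfffffbfff0000000),
    "vmemmap": (0xfffffc0000000000, 0xfffffe0000000000),
}
```
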
arm
===

ARM_LPAE
--------

It indicates whether the crash kernel supports large physical address
extensions. Used to translate virtual to physical addresses.

s390
====

lowcore_ptr
-----------

An array with a pointer to the lowcore of every CPU. Used to print the
PSW and all register information.

high_memory
-----------

Used to get the vmalloc_start address from the high_memory symbol.

(lowcore_ptr, NR_CPUS)
----------------------

The maximum number of CPUs.

powerpc
=======


node_data|(node_data, MAX_NUMNODES)
-----------------------------------

See above.

contig_page_data
----------------

See above.

vmemmap_list
------------

The vmemmap_list maintains the entire vmemmap physical mapping. Used
to get vmemmap list count and populated vmemmap regions info. If the
vmemmap address translation information is stored in the crash kernel,
it is used to translate vmemmap kernel virtual addresses.

mmu_vmemmap_psize
-----------------

The size of a page. Used to translate virtual to physical addresses.

mmu_psize_defs
--------------

Page size definitions, i.e. 4k, 64k, or 16M.

Used to make vtop translations.

vmemmap_backing|(vmemmap_backing, list)|(vmemmap_backing, phys)|(vmemmap_backing, virt_addr)
--------------------------------------------------------------------------------------------

The vmemmap virtual address space management does not have a traditional
page table to track which virtual struct pages are backed by a physical
mapping. The virtual to physical mappings are tracked in a simple linked
list format.

User-space tools need to know the offset of list, phys and virt_addr
when computing the count of vmemmap regions.

mmu_psize_def|(mmu_psize_def, shift)
------------------------------------

The size of a struct mmu_psize_def and the offset of mmu_psize_def's
member.

Used in vtop translations.

sh
==

node_data|(node_data, MAX_NUMNODES)
-----------------------------------

See above.

X2TLB
-----

Indicates whether the crashed kernel enabled SH extended mode.

RISCV64
=======

VA_BITS
-------

The maximum number of bits for virtual addresses. Used to compute the
virtual memory ranges.

PAGE_OFFSET
-----------

Indicates the virtual kernel start address of the direct-mapped RAM region.

phys_ram_base
-------------

Indicates the physical start address of RAM.

MODULES_VADDR|MODULES_END|VMALLOC_START|VMALLOC_END|VMEMMAP_START|VMEMMAP_END|KERNEL_LINK_ADDR
----------------------------------------------------------------------------------------------

Used to get the correct ranges:

  * MODULES_VADDR ~ MODULES_END : Kernel module space.
  * VMALLOC_START ~ VMALLOC_END : vmalloc() / ioremap() space.
  * VMEMMAP_START ~ VMEMMAP_END : vmemmap space, used for struct page array.
  * KERNEL_LINK_ADDR : start address of Kernel link and BPF

va_kernel_pa_offset
-------------------

Indicates the offset between the kernel virtual and physical mappings.
Used to translate virtual to physical addresses.

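The two RISC-V translations can be sketched as follows. The kernel
mapping case follows directly from the description of
va_kernel_pa_offset. The direct map case is an assumption of this
example: it takes the direct map to begin at PAGE_OFFSET and to map
physical memory starting at phys_ram_base. Function names are
illustrative.

```python
def riscv_kernel_va_to_pa(vaddr, va_kernel_pa_offset):
    """Translate an address in the kernel mapping to a physical address."""
    return vaddr - va_kernel_pa_offset


def riscv_linear_va_to_pa(vaddr, page_offset, phys_ram_base):
    """Translate a direct map address (assumed layout, see lead-in)."""
    return vaddr - page_offset + phys_ram_base
```
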