.. SPDX-License-Identifier: GPL-2.0

=================
Memory Management
=================

Complete virtual memory map with 4-level page tables
====================================================

.. note::

 - Negative addresses such as "-23 TB" are absolute addresses in bytes, counted down
   from the top of the 64-bit address space. It's easier to understand the layout
   when seen both in absolute addresses and in distance-from-top notation.

   For example, 0xffffe90000000000 == -23 TB: it is 23 TB lower than the top of the
   64-bit address space (ffffffffffffffff).

   Note that as we get closer to the top of the address space, the notation changes
   from TB to GB and then MB/KB.

 - "16M TB" might look weird at first sight, but it's an easier way to visualize size
   notation than "16 EB", which few will recognize at first sight as 16 exabytes.
   It also shows nicely how incredibly large the 64-bit address space is.

::

  ========================================================================================================================
      Start addr    |   Offset   |     End addr     |  Size   | VM area description
  ========================================================================================================================
                    |            |                  |         |
   0000000000000000 |    0       | 00007fffffffffff |  128 TB | user-space virtual memory, different per mm
  __________________|____________|__________________|_________|___________________________________________________________
                    |            |                  |         |
   0000800000000000 | +128    TB | ffff7fffffffffff | ~16M TB | ... huge, almost 64 bits wide hole of non-canonical
                    |            |                  |         |     virtual memory addresses up to the -128 TB
                    |            |                  |         |     starting offset of kernel mappings.
  __________________|____________|__________________|_________|___________________________________________________________
                                                              |
                                                              | Kernel-space virtual memory, shared between all processes:
  ____________________________________________________________|___________________________________________________________
                    |            |                  |         |
   ffff800000000000 | -128    TB | ffff87ffffffffff |    8 TB | ... guard hole, also reserved for hypervisor
   ffff880000000000 | -120    TB | ffff887fffffffff |  0.5 TB | LDT remap for PTI
   ffff888000000000 | -119.5  TB | ffffc87fffffffff |   64 TB | direct mapping of all physical memory (page_offset_base)
   ffffc88000000000 |  -55.5  TB | ffffc8ffffffffff |  0.5 TB | ... unused hole
   ffffc90000000000 |  -55    TB | ffffe8ffffffffff |   32 TB | vmalloc/ioremap space (vmalloc_base)
   ffffe90000000000 |  -23    TB | ffffe9ffffffffff |    1 TB | ... unused hole
   ffffea0000000000 |  -22    TB | ffffeaffffffffff |    1 TB | virtual memory map (vmemmap_base)
   ffffeb0000000000 |  -21    TB | ffffebffffffffff |    1 TB | ... unused hole
   ffffec0000000000 |  -20    TB | fffffbffffffffff |   16 TB | KASAN shadow memory
  __________________|____________|__________________|_________|____________________________________________________________
                                                              |
                                                              | Identical layout to the 56-bit one from here on:
  ____________________________________________________________|____________________________________________________________
                    |            |                  |         |
   fffffc0000000000 |   -4    TB | fffffdffffffffff |    2 TB | ... unused hole
                    |            |                  |         | vaddr_end for KASLR
   fffffe0000000000 |   -2    TB | fffffe7fffffffff |  0.5 TB | cpu_entry_area mapping
   fffffe8000000000 |   -1.5  TB | fffffeffffffffff |  0.5 TB | ... unused hole
   ffffff0000000000 |   -1    TB | ffffff7fffffffff |  0.5 TB | %esp fixup stacks
   ffffff8000000000 | -512    GB | ffffffeeffffffff |  444 GB | ... unused hole
   ffffffef00000000 |  -68    GB | fffffffeffffffff |   64 GB | EFI region mapping space
   ffffffff00000000 |   -4    GB | ffffffff7fffffff |    2 GB | ... unused hole
   ffffffff80000000 |   -2    GB | ffffffff9fffffff |  512 MB | kernel text mapping, mapped to physical address 0
   ffffffff80000000 |-2048    MB |                  |         |
   ffffffffa0000000 |-1536    MB | fffffffffeffffff | 1520 MB | module mapping space
   ffffffffff000000 |  -16    MB |                  |         |
      FIXADDR_START | ~-11    MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset
   ffffffffff600000 |  -10    MB | ffffffffff600fff |    4 kB | legacy vsyscall ABI
   ffffffffffe00000 |   -2    MB | ffffffffffffffff |    2 MB | ... unused hole
  __________________|____________|__________________|_________|___________________________________________________________


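The Offset and Size columns of the 4-level table can be cross-checked with a
few lines of arithmetic. The following is only an illustrative sketch: the
helper names ``offset_from_top`` and ``region_size`` are made up for this
example and do not exist in the kernel.

.. code-block:: python

   TOP = 1 << 64  # one past ffffffffffffffff, the top of the address space

   # Units from largest to smallest, as powers of two.
   UNITS = (("TB", 1 << 40), ("GB", 1 << 30), ("MB", 1 << 20), ("kB", 1 << 10))

   def offset_from_top(addr):
       """Return the distance-from-top notation for addr, e.g. '-23 TB'."""
       distance = TOP - addr
       for name, size in UNITS:
           if distance >= size:
               return f"-{distance / size:g} {name}"
       return f"-{distance} B"

   def region_size(start, end):
       """Return the size in bytes of the inclusive range [start, end]."""
       return end - start + 1

   print(offset_from_top(0xffffe90000000000))                        # -23 TB
   print(region_size(0xffff888000000000, 0xffffc87fffffffff) >> 40)  # 64

The sketch always picks the largest unit that fits, so it prints "-1.5 GB" for
ffffffffa0000000, where the table uses the equivalent "-1536 MB".
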
Complete virtual memory map with 5-level page tables
====================================================

.. note::

 - With 56-bit addresses, user-space memory gets expanded by a factor of 512x,
   from 0.125 PB to 64 PB. All kernel mappings shift down to the -64 PB starting
   offset and many of the regions expand to accommodate the much larger amount
   of physical memory supported.

::

  ========================================================================================================================
      Start addr    |   Offset   |     End addr     |  Size   | VM area description
  ========================================================================================================================
                    |            |                  |         |
   0000000000000000 |    0       | 00ffffffffffffff |   64 PB | user-space virtual memory, different per mm
  __________________|____________|__________________|_________|___________________________________________________________
                    |            |                  |         |
   0100000000000000 |  +64    PB | feffffffffffffff | ~16K PB | ... huge, still almost 64 bits wide hole of non-canonical
                    |            |                  |         |     virtual memory addresses up to the -64 PB
                    |            |                  |         |     starting offset of kernel mappings.
  __________________|____________|__________________|_________|___________________________________________________________
                                                              |
                                                              | Kernel-space virtual memory, shared between all processes:
  ____________________________________________________________|___________________________________________________________
                    |            |                  |         |
   ff00000000000000 |  -64    PB | ff0fffffffffffff |    4 PB | ... guard hole, also reserved for hypervisor
   ff10000000000000 |  -60    PB | ff10ffffffffffff | 0.25 PB | LDT remap for PTI
   ff11000000000000 |  -59.75 PB | ff90ffffffffffff |   32 PB | direct mapping of all physical memory (page_offset_base)
   ff91000000000000 |  -27.75 PB | ff9fffffffffffff | 3.75 PB | ... unused hole
   ffa0000000000000 |  -24    PB | ffd1ffffffffffff | 12.5 PB | vmalloc/ioremap space (vmalloc_base)
   ffd2000000000000 |  -11.5  PB | ffd3ffffffffffff |  0.5 PB | ... unused hole
   ffd4000000000000 |  -11    PB | ffd5ffffffffffff |  0.5 PB | virtual memory map (vmemmap_base)
   ffd6000000000000 |  -10.5  PB | ffdeffffffffffff | 2.25 PB | ... unused hole
   ffdf000000000000 |   -8.25 PB | fffffbffffffffff |   ~8 PB | KASAN shadow memory
  __________________|____________|__________________|_________|____________________________________________________________
                                                              |
                                                              | Identical layout to the 47-bit one from here on:
  ____________________________________________________________|____________________________________________________________
                    |            |                  |         |
   fffffc0000000000 |   -4    TB | fffffdffffffffff |    2 TB | ... unused hole
                    |            |                  |         | vaddr_end for KASLR
   fffffe0000000000 |   -2    TB | fffffe7fffffffff |  0.5 TB | cpu_entry_area mapping
   fffffe8000000000 |   -1.5  TB | fffffeffffffffff |  0.5 TB | ... unused hole
   ffffff0000000000 |   -1    TB | ffffff7fffffffff |  0.5 TB | %esp fixup stacks
   ffffff8000000000 | -512    GB | ffffffeeffffffff |  444 GB | ... unused hole
   ffffffef00000000 |  -68    GB | fffffffeffffffff |   64 GB | EFI region mapping space
   ffffffff00000000 |   -4    GB | ffffffff7fffffff |    2 GB | ... unused hole
   ffffffff80000000 |   -2    GB | ffffffff9fffffff |  512 MB | kernel text mapping, mapped to physical address 0
   ffffffff80000000 |-2048    MB |                  |         |
   ffffffffa0000000 |-1536    MB | fffffffffeffffff | 1520 MB | module mapping space
   ffffffffff000000 |  -16    MB |                  |         |
      FIXADDR_START | ~-11    MB | ffffffffff5fffff | ~0.5 MB | kernel-internal fixmap range, variable size and offset
   ffffffffff600000 |  -10    MB | ffffffffff600fff |    4 kB | legacy vsyscall ABI
   ffffffffffe00000 |   -2    MB | ffffffffffffffff |    2 MB | ... unused hole
  __________________|____________|__________________|_________|___________________________________________________________

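The 512x expansion of user space and the start of the kernel mappings in both
tables follow directly from the number of user-space address bits (47 with
4-level paging, 56 with 5-level paging). A small sketch of that arithmetic,
with hypothetical helper names chosen only for this example:

.. code-block:: python

   def user_space_end(user_bits):
       """Highest user-space address for the given number of user address bits."""
       return (1 << user_bits) - 1

   def kernel_space_start(user_bits):
       """Lowest kernel address: the same distance below the top of 64 bits."""
       return (1 << 64) - (1 << user_bits)

   # 47 user bits with 4-level paging, 56 user bits with 5-level paging.
   ratio = (1 << 56) // (1 << 47)

   print(f"{user_space_end(47):016x}")      # 00007fffffffffff
   print(f"{kernel_space_start(47):016x}")  # ffff800000000000
   print(f"{user_space_end(56):016x}")      # 00ffffffffffffff
   print(f"{kernel_space_start(56):016x}")  # ff00000000000000
   print(ratio)                             # 512
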
The architecture defines a 64-bit virtual address. Implementations can support
fewer bits. Currently supported are 48- and 57-bit virtual addresses. Bits 63
through to the most-significant implemented bit are sign extended.
This causes a hole between user-space and kernel addresses if you interpret
them as unsigned.

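The sign-extension rule can be made concrete with a short check for canonical
addresses. This is a minimal sketch, not kernel code: ``sign_extend`` and
``is_canonical`` are names invented for this example, and ``va_bits`` is
assumed to be 48 or 57.

.. code-block:: python

   MASK64 = (1 << 64) - 1

   def sign_extend(addr, va_bits):
       """Copy bit (va_bits - 1) of addr into all bits above it, up to bit 63."""
       low = addr & ((1 << va_bits) - 1)
       if low >> (va_bits - 1):                      # top implemented bit is set
           low |= MASK64 & ~((1 << va_bits) - 1)     # fill the upper bits with ones
       return low

   def is_canonical(addr, va_bits=48):
       """An address is canonical if sign extension leaves it unchanged."""
       return sign_extend(addr, va_bits) == addr

   print(is_canonical(0x00007fffffffffff))      # True: last user-space address
   print(is_canonical(0x0000800000000000))      # False: first address of the hole
   print(is_canonical(0xffff800000000000))      # True: first kernel address
   print(is_canonical(0xff00000000000000, 57))  # True with 57-bit addresses

Addresses that fail this check are exactly the non-canonical hole shown in the
tables.
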
The direct mapping covers all memory in the system up to the highest
memory address (this means in some cases it can also include PCI memory
holes).

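Because the direct mapping is linear, translating between a physical address
and its direct-map virtual address is a single addition or subtraction. The
sketch below assumes the default 4-level base from the table
(ffff888000000000) and no CONFIG_RANDOMIZE_MEMORY offset. The function names
are stand-ins for this example, not the kernel's own helpers.

.. code-block:: python

   # Assumed default from the 4-level table; randomization changes this base.
   PAGE_OFFSET_BASE = 0xffff888000000000
   DIRECT_MAP_SIZE = 64 << 40  # 64 TB, per the 4-level table

   def phys_to_virt(phys, base=PAGE_OFFSET_BASE):
       """Return the direct-map virtual address for a physical address."""
       if not 0 <= phys < DIRECT_MAP_SIZE:
           raise ValueError("physical address outside the direct mapping")
       return base + phys

   def virt_to_phys(virt, base=PAGE_OFFSET_BASE):
       """Return the physical address behind a direct-map virtual address."""
       if not base <= virt < base + DIRECT_MAP_SIZE:
           raise ValueError("virtual address outside the direct mapping")
       return virt - base

   print(f"{phys_to_virt(0x100000):016x}")  # ffff888000100000
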
We map EFI runtime services in the 'efi_pgd' PGD in a 64GB large virtual
memory window (this size is arbitrary; it can be raised later if needed).
The mappings are not part of any other kernel PGD and are only available
during EFI runtime calls.

Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all
physical memory, vmalloc/ioremap space and virtual memory map are randomized.
Their order is preserved but their base will be offset early at boot time.

Be very careful with regard to KASLR when changing anything here. The KASLR
address range must not overlap with anything except the KASAN shadow area,
which is correct as KASAN disables KASLR.

For both 4- and 5-level layouts, the STACKLEAK_POISON value lies in the last
2MB hole: ffffffffffff4111

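That the poison value falls inside the last unused hole can be verified
against the table bounds. The constant names below are chosen for this sketch
only; the addresses are taken from the tables above.

.. code-block:: python

   STACKLEAK_POISON = 0xffffffffffff4111
   HOLE_START = 0xffffffffffe00000  # start of the last 2 MB unused hole
   HOLE_END = 0xffffffffffffffff    # top of the address space

   in_hole = HOLE_START <= STACKLEAK_POISON <= HOLE_END
   distance_from_top = (1 << 64) - STACKLEAK_POISON

   print(in_hole)                 # True
   print(hex(distance_from_top))  # 0xbeef

Read as a signed 64-bit integer, the value is therefore -0xbeef.
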