==============================
Memory Layout on AArch64 Linux
==============================

Author: Catalin Marinas <catalin.marinas@arm.com>

This document describes the virtual memory layout used by the AArch64
Linux kernel. The architecture allows up to 4 levels of translation
tables with a 4KB page size and up to 3 levels with a 64KB page size.

AArch64 Linux uses either 3 levels or 4 levels of translation tables
with the 4KB page configuration, allowing 39-bit (512GB) or 48-bit
(256TB) virtual addresses, respectively, for both user and kernel. With
64KB pages, only 2 levels of translation tables, allowing 42-bit (4TB)
virtual address, are used but the memory layout is the same.

ARMv8.2 adds optional support for Large Virtual Address space. This is
only available when running with a 64KB page size and expands the
number of descriptors in the first level of translation.

User addresses have bits 63:48 set to 0 while the kernel addresses have
the same bits set to 1. TTBRx selection is given by bit 63 of the
virtual address. The swapper_pg_dir contains only kernel (global)
mappings while the user pgd contains only user (non-global) mappings.
The swapper_pg_dir address is written to TTBR1 and never written to
TTBR0.
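
The address split described above can be illustrated with a minimal C
sketch for the 48-bit configuration. The names va_kind, classify_va48
and ttbr_index are hypothetical helpers chosen for this example; they
are not kernel interfaces:

.. code-block:: c

  #include <stdint.h>

  /* Hypothetical classification of a 64-bit virtual address (48-bit VAs). */
  enum va_kind { VA_USER, VA_KERNEL, VA_INVALID };

  /* User addresses have bits 63:48 all 0, kernel addresses have them all 1. */
  static enum va_kind classify_va48(uint64_t va)
  {
          uint64_t top = va >> 48;        /* bits 63:48 */

          if (top == 0x0000)
                  return VA_USER;
          if (top == 0xffff)
                  return VA_KERNEL;
          return VA_INVALID;              /* falls in the unmapped hole */
  }

  /* TTBRx selection is given by bit 63: 0 selects TTBR0, 1 selects TTBR1. */
  static int ttbr_index(uint64_t va)
  {
          return (int)(va >> 63);
  }

Under these assumptions, 0000ffffffffffff classifies as a user address
translated via TTBR0, and ffff000000000000 as a kernel address
translated via TTBR1.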


AArch64 Linux memory layout with 4KB pages + 4 levels (48-bit)::

  Start             End               Size      Use
  -----------------------------------------------------------------------
  0000000000000000  0000ffffffffffff   256TB    user
  ffff000000000000  ffff7fffffffffff   128TB    kernel logical memory map
 [ffff600000000000  ffff7fffffffffff]   32TB    [kasan shadow region]
  ffff800000000000  ffff80007fffffff     2GB    modules
  ffff800080000000  fffffbffefffffff   124TB    vmalloc
  fffffbfff0000000  fffffbfffdffffff   224MB    fixed mappings (top down)
  fffffbfffe000000  fffffbfffe7fffff     8MB    [guard region]
  fffffbfffe800000  fffffbffff7fffff    16MB    PCI I/O space
  fffffbffff800000  fffffbffffffffff     8MB    [guard region]
  fffffc0000000000  fffffdffffffffff     2TB    vmemmap
  fffffe0000000000  ffffffffffffffff     2TB    [guard region]


AArch64 Linux memory layout with 64KB pages + 3 levels (52-bit with HW support)::

  Start             End               Size      Use
  -----------------------------------------------------------------------
  0000000000000000  000fffffffffffff     4PB    user
  fff0000000000000  ffff7fffffffffff    ~4PB    kernel logical memory map
 [fffd800000000000  ffff7fffffffffff]  512TB    [kasan shadow region]
  ffff800000000000  ffff80007fffffff     2GB    modules
  ffff800080000000  fffffbffefffffff   124TB    vmalloc
  fffffbfff0000000  fffffbfffdffffff   224MB    fixed mappings (top down)
  fffffbfffe000000  fffffbfffe7fffff     8MB    [guard region]
  fffffbfffe800000  fffffbffff7fffff    16MB    PCI I/O space
  fffffbffff800000  fffffbffffffffff     8MB    [guard region]
  fffffc0000000000  ffffffdfffffffff    ~4TB    vmemmap
  ffffffe000000000  ffffffffffffffff   128GB    [guard region]


Translation table lookup with 4KB pages::

  +--------+--------+--------+--------+--------+--------+--------+--------+
  |63    56|55    48|47    40|39    32|31    24|23    16|15     8|7      0|
  +--------+--------+--------+--------+--------+--------+--------+--------+
   |                 |         |         |         |         |
   |                 |         |         |         |         v
   |                 |         |         |         |   [11:0]  in-page offset
   |                 |         |         |         +-> [20:12] L3 index
   |                 |         |         +-----------> [29:21] L2 index
   |                 |         +---------------------> [38:30] L1 index
   |                 +-------------------------------> [47:39] L0 index
   +-------------------------------------------------> [63] TTBR0/1


Translation table lookup with 64KB pages::

  +--------+--------+--------+--------+--------+--------+--------+--------+
  |63    56|55    48|47    40|39    32|31    24|23    16|15     8|7      0|
  +--------+--------+--------+--------+--------+--------+--------+--------+
   |                 |    |               |              |
   |                 |    |               |              v
   |                 |    |               |            [15:0]  in-page offset
   |                 |    |               +----------> [28:16] L3 index
   |                 |    +--------------------------> [41:29] L2 index
   |                 +-------------------------------> [47:42] L1 index (48-bit)
   |                                                   [51:42] L1 index (52-bit)
   +-------------------------------------------------> [63] TTBR0/1


When using KVM without the Virtualization Host Extensions, the
hypervisor maps kernel pages in EL2 at a fixed (and potentially
random) offset from the linear mapping. See the kern_hyp_va macro and
kvm_update_va_mask function for more details. MMIO devices such as
GICv2 get mapped next to the HYP idmap page, as do vectors when
ARM64_SPECTRE_V3A is enabled for particular CPUs.

When using KVM with the Virtualization Host Extensions, no additional
mappings are created, since the host kernel runs directly in EL2.

52-bit VA support in the kernel
-------------------------------
If the ARMv8.2-LVA optional feature is present, and we are running
with a 64KB page size, then it is possible to use 52 bits of address
space for both userspace and kernel addresses. However, any kernel
binary that supports 52-bit must also be able to fall back to 48-bit
at early boot time if the hardware feature is not present.

This fallback mechanism requires the kernel .text to be in the
higher addresses such that it is invariant to 48/52-bit VAs.
Due to the kasan shadow being a fraction of the entire kernel VA space,
the end of the kasan shadow must also be in the higher half of the
kernel VA space for both 48/52-bit. (When switching from 48-bit to
52-bit, the end of the kasan shadow is invariant and dependent on ~0UL,
whilst the start address will "grow" towards the lower addresses).

In order to optimise phys_to_virt and virt_to_phys, the PAGE_OFFSET
is kept constant at 0xFFF0000000000000 (corresponding to 52-bit);
this obviates the need for an extra variable read. The physvirt
offset and vmemmap offsets are computed at early boot to enable
this logic.

As a single binary will need to support both 48-bit and 52-bit VA
spaces, the VMEMMAP must be sized large enough for 52-bit VAs and
also must be sized large enough to accommodate a fixed PAGE_OFFSET.

Most code in the kernel should not need to consider VA_BITS. For
code that does need to know the VA size, the variables are
defined as follows:

VA_BITS         constant        the *maximum* VA space size

VA_BITS_MIN     constant        the *minimum* VA space size

vabits_actual   variable        the *actual* VA space size


Maximum and minimum sizes can be useful to ensure that buffers are
sized large enough or that addresses are positioned close enough for
the "worst" case.

52-bit userspace VAs
--------------------
To maintain compatibility with software that relies on the ARMv8.0
VA space maximum size of 48 bits, the kernel will, by default,
return virtual addresses to userspace from a 48-bit range.

Software can "opt-in" to receiving VAs from a 52-bit space by
specifying an mmap hint parameter that is larger than 48 bits.

For example:

.. code-block:: c

   maybe_high_address = mmap(~0UL, size, prot, flags,...);

It is also possible to build a debug kernel that returns addresses
from a 52-bit space by enabling the following kernel config options:

.. code-block:: sh

   CONFIG_EXPERT=y && CONFIG_ARM64_FORCE_52BIT=y

Note that this option is only intended for debugging applications
and should not be used in production.
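
As a rough illustration of the mmap opt-in described above, the
following userspace sketch passes an all-ones hint for an anonymous
private mapping and checks whether the returned address lies above the
48-bit range. The helper names map_maybe_high and is_above_48bit are
hypothetical, and the choice of flags is an assumption made for this
example. On hardware or kernels without 52-bit support, the mapping is
expected to come from the 48-bit range instead:

.. code-block:: c

  #define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on strict C modes */
  #include <stddef.h>
  #include <stdint.h>
  #include <sys/mman.h>

  /*
   * Hypothetical helper: request an anonymous mapping with a hint above
   * 48 bits. Returns NULL if the mapping fails.
   */
  static void *map_maybe_high(size_t size)
  {
          void *p = mmap((void *)~0UL, size, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

          return p == MAP_FAILED ? NULL : p;
  }

  /* Non-zero if any of bits 63:48 of the address are set. */
  static int is_above_48bit(const void *p)
  {
          return ((uintptr_t)p >> 48) != 0;
  }

A caller would check is_above_48bit() on the result of map_maybe_high()
to see whether the kernel actually handed out a 52-bit address, and
release the mapping with munmap() when done.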