.. SPDX-License-Identifier: GPL-2.0

======================
Memory Protection Keys
======================

Memory Protection Keys provide a mechanism for enforcing page-based
protections, but without requiring modification of the page tables when an
application changes protection domains.

Pkeys Userspace (PKU) is a feature which can be found on:

* Intel server CPUs, Skylake and later
* Intel client CPUs, Tiger Lake (11th Gen Core) and later
* Future AMD CPUs

Pkeys work by dedicating 4 previously reserved bits in each page table entry to
a "protection key", giving 16 possible keys.

Protections for each key are defined with a per-CPU user-accessible register
(PKRU). Each of these is a 32-bit register storing two bits (Access Disable
and Write Disable) for each of 16 keys.

Being a CPU register, PKRU is inherently thread-local, potentially giving each
thread a different set of protections from every other thread.

There are two instructions (RDPKRU/WRPKRU) for reading and writing to the
register. The feature is only available in 64-bit mode, even though there is
theoretically space in the PAE PTEs. These permissions are enforced on data
access only and have no effect on instruction fetches.

Syscalls
========

There are 3 system calls which directly interact with pkeys::

	int pkey_alloc(unsigned long flags, unsigned long init_access_rights);
	int pkey_free(int pkey);
	int pkey_mprotect(unsigned long start, size_t len,
			  unsigned long prot, int pkey);

Before a pkey can be used, it must first be allocated with
pkey_alloc(). An application calls the WRPKRU instruction
directly in order to change access permissions to memory covered
with a key. In this example WRPKRU is wrapped by a C function
called pkey_set().
::

	int real_prot = PROT_READ|PROT_WRITE;
	pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);
	ptr = mmap(NULL, PAGE_SIZE, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
	ret = pkey_mprotect(ptr, PAGE_SIZE, real_prot, pkey);
	... application runs here

Now, if the application needs to update the data at 'ptr', it can
gain access, do the update, then remove its write access::

	pkey_set(pkey, 0); // clear PKEY_DISABLE_WRITE
	*ptr = foo; // assign something
	pkey_set(pkey, PKEY_DISABLE_WRITE); // set PKEY_DISABLE_WRITE again

When the application frees the memory, it should also free the pkey,
since the key is no longer in use::

	munmap(ptr, PAGE_SIZE);
	pkey_free(pkey);

.. note:: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
          An example implementation can be found in
          tools/testing/selftests/x86/protection_keys.c.

Behavior
========

The kernel attempts to make protection keys consistent with the
behavior of a plain mprotect(). For instance if you do this::

	mprotect(ptr, size, PROT_NONE);
	something(ptr);

you can expect the same effects with protection keys when doing this::

	pkey = pkey_alloc(0, PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE);
	pkey_mprotect(ptr, size, PROT_READ|PROT_WRITE, pkey);
	something(ptr);

That should be true whether something() is a direct access to 'ptr'
like::

	*ptr = foo;

or when the kernel does the access on the application's behalf like
with a read()::

	read(fd, ptr, 1);

The kernel will send a SIGSEGV in both cases, but si_code will be set
to SEGV_PKUERR when violating protection keys versus SEGV_ACCERR when
the plain mprotect() permissions are violated.
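
To complement the pkey_set() usage shown in the Syscalls section, here is a
minimal sketch of the bit arithmetic such a wrapper has to perform on a PKRU
value before writing it back with WRPKRU. It assumes the x86 layout in which
key N occupies bits 2N (Access Disable) and 2N+1 (Write Disable), matching the
PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE values. The helper names
pkru_get_rights() and pkru_set_rights() are hypothetical and are not part of
any kernel or libc interface; the sketch only manipulates a plain integer and
does not touch the real register.

```c
#include <assert.h>
#include <stdint.h>

/* Access-rights bits, with the values used by the Linux UAPI headers. */
#ifndef PKEY_DISABLE_ACCESS
#define PKEY_DISABLE_ACCESS 0x1
#endif
#ifndef PKEY_DISABLE_WRITE
#define PKEY_DISABLE_WRITE 0x2
#endif

#define PKRU_BITS_PER_PKEY 2	/* Access Disable + Write Disable */
#define PKRU_NR_PKEYS 16	/* 16 keys * 2 bits = 32-bit register */

/* Hypothetical helper: extract the two rights bits stored for 'pkey'. */
static inline uint32_t pkru_get_rights(uint32_t pkru, int pkey)
{
	assert(pkey >= 0 && pkey < PKRU_NR_PKEYS);
	return (pkru >> (pkey * PKRU_BITS_PER_PKEY)) & 0x3u;
}

/*
 * Hypothetical helper: return a copy of 'pkru' in which the two bits
 * belonging to 'pkey' are replaced by 'rights'. The bits of all other
 * keys are left untouched.
 */
static inline uint32_t pkru_set_rights(uint32_t pkru, int pkey, uint32_t rights)
{
	int shift;

	assert(pkey >= 0 && pkey < PKRU_NR_PKEYS);
	shift = pkey * PKRU_BITS_PER_PKEY;

	pkru &= ~(0x3u << shift);		/* clear the old rights */
	pkru |= (rights & 0x3u) << shift;	/* install the new rights */
	return pkru;
}
```

A real pkey_set() would read the current value with RDPKRU, pass it through a
function like pkru_set_rights(), and write the result back with WRPKRU. Because
the update is a read-modify-write of a thread-local register, it affects only
the calling thread.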