.. SPDX-License-Identifier: GPL-2.0
.. Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net>
.. Copyright © 2019-2020 ANSSI
.. Copyright © 2021-2022 Microsoft Corporation

=====================================
Landlock: unprivileged access control
=====================================

:Author: Mickaël Salaün
:Date: October 2022

The goal of Landlock is to make it possible to restrict ambient rights (e.g.
global filesystem access) for a set of processes. Because Landlock is a
stackable LSM, it makes it possible to create safe security sandboxes as new
security layers in addition to the existing system-wide access-controls. This
kind of sandbox is expected to help mitigate the security impact of bugs or
unexpected/malicious behaviors in user space applications. Landlock empowers
any process, including unprivileged ones, to securely restrict themselves.

We can quickly make sure that Landlock is enabled in the running system by
looking for "landlock: Up and running" in kernel logs (as root): ``dmesg | grep
landlock || journalctl -kg landlock``. Developers can also easily check for
Landlock support with a :ref:`related system call <landlock_abi_versions>`. If
Landlock is not currently supported, we need to :ref:`configure the kernel
appropriately <kernel_support>`.

Landlock rules
==============

A Landlock rule describes an action on an object. An object is currently a
file hierarchy, and the related filesystem actions are defined with `access
rights`_. A set of rules is aggregated in a ruleset, which can then restrict
the thread enforcing it, and its future children.

Defining and enforcing a security policy
----------------------------------------

We first need to define the ruleset that will contain our rules. For this
example, the ruleset will contain rules that only allow read actions, but write
actions will be denied. The ruleset then needs to handle both of these kinds of
actions.
This is required for backward and forward compatibility (i.e. the
kernel and user space may not know each other's supported restrictions), hence
the need to be explicit about the denied-by-default access rights.

.. code-block:: c

    struct landlock_ruleset_attr ruleset_attr = {
        .handled_access_fs =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_WRITE_FILE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_FILE |
            LANDLOCK_ACCESS_FS_MAKE_CHAR |
            LANDLOCK_ACCESS_FS_MAKE_DIR |
            LANDLOCK_ACCESS_FS_MAKE_REG |
            LANDLOCK_ACCESS_FS_MAKE_SOCK |
            LANDLOCK_ACCESS_FS_MAKE_FIFO |
            LANDLOCK_ACCESS_FS_MAKE_BLOCK |
            LANDLOCK_ACCESS_FS_MAKE_SYM |
            LANDLOCK_ACCESS_FS_REFER |
            LANDLOCK_ACCESS_FS_TRUNCATE,
    };

Because we may not know on which kernel version an application will be
executed, it is safer to follow a best-effort security approach. Indeed, we
should try to protect users as much as possible whatever the kernel they are
using. To avoid binary enforcement (i.e. either all security features or
none), we can leverage a dedicated Landlock command to get the current version
of the Landlock ABI and adapt the handled accesses. Let's check if we should
remove the ``LANDLOCK_ACCESS_FS_REFER`` or ``LANDLOCK_ACCESS_FS_TRUNCATE``
access rights, which are only supported starting with the second and third
versions of the ABI, respectively.

.. code-block:: c

    int abi;

    abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
    if (abi < 0) {
        /* Degrades gracefully if Landlock is not handled. */
        perror("The running kernel does not enable to use Landlock");
        return 0;
    }
    switch (abi) {
    case 1:
        /* Removes LANDLOCK_ACCESS_FS_REFER for ABI < 2 */
        ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_REFER;
        __attribute__((fallthrough));
    case 2:
        /* Removes LANDLOCK_ACCESS_FS_TRUNCATE for ABI < 3 */
        ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_TRUNCATE;
    }

This makes it possible to create an inclusive ruleset that will contain our
rules.

.. code-block:: c

    int ruleset_fd;

    ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
    if (ruleset_fd < 0) {
        perror("Failed to create a ruleset");
        return 1;
    }

We can now add a new rule to this ruleset thanks to the returned file
descriptor referring to this ruleset. The rule will only allow reading the
file hierarchy ``/usr``. Without another rule, write actions would then be
denied by the ruleset. To add ``/usr`` to the ruleset, we open it with the
``O_PATH`` flag and fill the &struct landlock_path_beneath_attr with this file
descriptor.

.. code-block:: c

    int err;
    struct landlock_path_beneath_attr path_beneath = {
        .allowed_access =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR,
    };

    path_beneath.parent_fd = open("/usr", O_PATH | O_CLOEXEC);
    if (path_beneath.parent_fd < 0) {
        perror("Failed to open file");
        close(ruleset_fd);
        return 1;
    }
    err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
                            &path_beneath, 0);
    close(path_beneath.parent_fd);
    if (err) {
        perror("Failed to update ruleset");
        close(ruleset_fd);
        return 1;
    }

It may also be required to create rules following the same logic as explained
for the ruleset creation, by filtering access rights according to the Landlock
ABI version.
In this example, this is not required because all of the requested
``allowed_access`` rights are already available in ABI 1.

We now have a ruleset with one rule allowing read access to ``/usr`` while
denying all other handled accesses for the filesystem. The next step is to
restrict the current thread from gaining more privileges (e.g. thanks to a SUID
binary).

.. code-block:: c

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
        perror("Failed to restrict privileges");
        close(ruleset_fd);
        return 1;
    }

The current thread is now ready to sandbox itself with the ruleset.

.. code-block:: c

    if (landlock_restrict_self(ruleset_fd, 0)) {
        perror("Failed to enforce ruleset");
        close(ruleset_fd);
        return 1;
    }
    close(ruleset_fd);

If the ``landlock_restrict_self`` system call succeeds, the current thread is
now restricted and this policy will be enforced on all its subsequently created
children as well. Once a thread is landlocked, there is no way to remove its
security policy; only adding more restrictions is allowed. These threads are
now in a new Landlock domain, which is the merge of their parent's domain (if
any) with the new ruleset.

Full working code can be found in `samples/landlock/sandboxer.c`_.

Good practices
--------------

It is recommended to set access rights on file hierarchy leaves as much as
possible. For instance, it is better to have ``~/doc/`` as a read-only
hierarchy and ``~/tmp/`` as a read-write hierarchy, compared to ``~/`` as a
read-only hierarchy and ``~/tmp/`` as a read-write hierarchy.
Following this good practice leads to self-sufficient hierarchies that do not
depend on their location (i.e. parent directories). This is particularly
relevant when we want to allow linking or renaming.
Indeed, having consistent
access rights per directory makes it possible to change the location of such a
directory without relying on the destination directory access rights (except
those that are required for this operation, see ``LANDLOCK_ACCESS_FS_REFER``
documentation).
Having self-sufficient hierarchies also helps to tighten the required access
rights to the minimal set of data. This also helps avoid sinkhole directories,
i.e. directories where data can be linked to but not linked from. However,
this depends on data organization, which might not be controlled by developers.
In this case, granting read-write access to ``~/tmp/``, instead of write-only
access, would potentially allow moving ``~/tmp/`` to a non-readable directory
while still keeping the ability to list the content of ``~/tmp/``.

Layers of file path access rights
---------------------------------

Each time a thread enforces a ruleset on itself, it updates its Landlock domain
with a new layer of policy. Indeed, this complementary policy is stacked with
any other rulesets already restricting this thread. A sandboxed thread can
then safely add more constraints to itself with a new enforced ruleset.

One policy layer grants access to a file path if at least one of its rules
encountered on the path grants the access. A sandboxed thread can only access
a file path if all its enforced policy layers grant the access as well as all
the other system access controls (e.g. filesystem DAC, other LSM policies,
etc.).

Bind mounts and OverlayFS
-------------------------

Landlock makes it possible to restrict access to file hierarchies, which means
that these access rights can be propagated with bind mounts (cf.
Documentation/filesystems/sharedsubtree.rst) but not with
Documentation/filesystems/overlayfs.rst.

A bind mount mirrors a source file hierarchy to a destination.
The destination
hierarchy is then composed of the exact same files, on which Landlock rules can
be tied, either via the source or the destination path. These rules restrict
access when they are encountered on a path, which means that they can restrict
access to multiple file hierarchies at the same time, whether these hierarchies
are the result of bind mounts or not.

An OverlayFS mount point consists of upper and lower layers. These layers are
combined in a merge directory, which is the result of the mount point. This
merge hierarchy may include files from the upper and lower layers, but
modifications performed on the merge hierarchy are only reflected in the upper
layer. From a Landlock policy point of view, each OverlayFS layer and merge
hierarchy is standalone and contains its own set of files and directories,
which is different from bind mounts. A policy restricting an OverlayFS layer
will not restrict the resulting merged hierarchy, and vice versa. Landlock
users should then only think about file hierarchies they want to allow access
to, regardless of the underlying filesystem.

Inheritance
-----------

Every new thread resulting from a :manpage:`clone(2)` inherits Landlock domain
restrictions from its parent. This is similar to the seccomp inheritance (cf.
Documentation/userspace-api/seccomp_filter.rst) or any other LSM dealing with
task's :manpage:`credentials(7)`. For instance, one process's thread may apply
Landlock rules to itself, but they will not be automatically applied to other
sibling threads (unlike POSIX thread credential changes, cf.
:manpage:`nptl(7)`).

When a thread sandboxes itself, we have the guarantee that the related security
policy will stay enforced on all this thread's descendants.
This allows
creating standalone and modular security policies per application, which will
automatically be composed between themselves according to their runtime parent
policies.

Ptrace restrictions
-------------------

A sandboxed process has fewer privileges than a non-sandboxed process and must
then be subject to additional restrictions when manipulating another process.
To be allowed to use :manpage:`ptrace(2)` and related syscalls on a target
process, a sandboxed process should have a subset of the target process rules,
which means the tracee must be in a sub-domain of the tracer.

Truncating files
----------------

The operations covered by ``LANDLOCK_ACCESS_FS_WRITE_FILE`` and
``LANDLOCK_ACCESS_FS_TRUNCATE`` both change the contents of a file and sometimes
overlap in non-intuitive ways. It is recommended to always specify both of
these together.

A particularly surprising example is :manpage:`creat(2)`. The name suggests
that this system call requires the rights to create and write files. However,
it also requires the truncate right if an existing file under the same name is
already present.

It should also be noted that truncating files does not require the
``LANDLOCK_ACCESS_FS_WRITE_FILE`` right. Apart from the :manpage:`truncate(2)`
system call, this can also be done through :manpage:`open(2)` with the flags
``O_RDONLY | O_TRUNC``.

When opening a file, the availability of the ``LANDLOCK_ACCESS_FS_TRUNCATE``
right is associated with the newly created file descriptor and will be used for
subsequent truncation attempts using :manpage:`ftruncate(2)`. The behavior is
similar to opening a file for reading or writing, where permissions are checked
during :manpage:`open(2)`, but not during the subsequent :manpage:`read(2)` and
:manpage:`write(2)` calls.
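
The association between the truncate right and a file descriptor can be
observed with a small experiment. The following is only a minimal sketch,
assuming a kernel with Landlock ABI 3 or later and UAPI headers providing
``<linux/landlock.h>``. The helper names ``check_in_sandbox()`` and
``truncate_fd_demo()``, the temporary file path and the return code convention
are illustrative and not part of Landlock. Raw :manpage:`syscall(2)`
invocations stand in for the ``landlock_*()`` wrappers used in the previous
snippets, and the sandboxing happens in a child process so that the caller
stays unrestricted.

.. code-block:: c

    #define _GNU_SOURCE
    #include <errno.h>
    #include <fcntl.h>
    #include <linux/landlock.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/prctl.h>
    #include <sys/syscall.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Fallback for UAPI headers older than Landlock ABI 3. */
    #ifndef LANDLOCK_ACCESS_FS_TRUNCATE
    #define LANDLOCK_ACCESS_FS_TRUNCATE (1ULL << 14)
    #endif

    /* Child: returns 0 if only the pre-sandbox descriptor can truncate. */
    static int check_in_sandbox(const char *path, int fd_before)
    {
        struct landlock_ruleset_attr attr = {
            .handled_access_fs = LANDLOCK_ACCESS_FS_TRUNCATE,
        };
        int ruleset_fd, fd_after;

        ruleset_fd = syscall(__NR_landlock_create_ruleset, &attr,
                             sizeof(attr), 0);
        if (ruleset_fd < 0)
            return 2;
        /* No rule is added, so the truncate right is granted on no path. */
        if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) ||
            syscall(__NR_landlock_restrict_self, ruleset_fd, 0))
            return 2;
        close(ruleset_fd);
        /* WRITE_FILE is not handled, so opening for writing still works. */
        fd_after = open(path, O_WRONLY | O_CLOEXEC);
        if (fd_after < 0)
            return 2;
        /* Opened before sandboxing: truncation is expected to succeed. */
        if (ftruncate(fd_before, 0))
            return 2;
        /* Opened after sandboxing: truncation is expected to be denied. */
        if (ftruncate(fd_after, 0) == 0 || errno != EACCES)
            return 2;
        return 0;
    }

    /*
     * Hypothetical demo entry point. Returns 0 when the behavior matches
     * the description, 1 when Landlock ABI 3 is unavailable, 2 otherwise.
     */
    static int truncate_fd_demo(void)
    {
        char path[] = "/tmp/landlock-truncate-XXXXXX";
        int abi, fd_before, status;
        pid_t child;

        abi = syscall(__NR_landlock_create_ruleset, NULL, 0,
                      LANDLOCK_CREATE_RULESET_VERSION);
        if (abi < 3)
            return 1;
        fd_before = mkstemp(path);
        if (fd_before < 0)
            return 2;
        child = fork();
        if (child == 0)
            _exit(check_in_sandbox(path, fd_before));
        close(fd_before);
        if (child < 0 || waitpid(child, &status, 0) != child) {
            unlink(path);
            return 2;
        }
        unlink(path);
        return WIFEXITED(status) ? WEXITSTATUS(status) : 2;
    }

    int main(void)
    {
        switch (truncate_fd_demo()) {
        case 0:
            printf("Only the descriptor opened before sandboxing can truncate.\n");
            return 0;
        case 1:
            printf("Landlock ABI 3 is not available, nothing to show.\n");
            return 0;
        default:
            printf("Unexpected result.\n");
            return 1;
        }
    }

On a kernel supporting Landlock ABI 3, the first message is the expected
output: both descriptors refer to the same file, yet only the one opened before
the ruleset was enforced carries the truncate right.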

As a consequence, it is possible to have multiple open file descriptors for the
same file, where one grants the right to truncate the file and the other does
not. It is also possible to pass such file descriptors between processes,
keeping their Landlock properties, even when these processes do not have an
enforced Landlock ruleset.

Compatibility
=============

Backward and forward compatibility
----------------------------------

Landlock is designed to be compatible with past and future versions of the
kernel. This is achieved thanks to the system call attributes and the
associated bitflags, particularly the ruleset's ``handled_access_fs``. Making
handled access rights explicit enables the kernel and user space to have a
clear contract with each other. This is required to make sure sandboxing will
not get stricter with a system update, which could break applications.

Developers can subscribe to the `Landlock mailing list
<https://subspace.kernel.org/lists.linux.dev.html>`_ to knowingly update and
test their applications with the latest available features. In the interest of
users, and because they may use different kernel versions, it is strongly
encouraged to follow a best-effort security approach by checking the Landlock
ABI version at runtime and only enforcing the supported features.

.. _landlock_abi_versions:

Landlock ABI versions
---------------------

The Landlock ABI version can be read with the sys_landlock_create_ruleset()
system call:

.. code-block:: c

    int abi;

    abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
    if (abi < 0) {
        switch (errno) {
        case ENOSYS:
            printf("Landlock is not supported by the current kernel.\n");
            break;
        case EOPNOTSUPP:
            printf("Landlock is currently disabled.\n");
            break;
        }
        return 0;
    }
    if (abi >= 2) {
        printf("Landlock supports LANDLOCK_ACCESS_FS_REFER.\n");
    }

The following kernel interfaces are implicitly supported by the first ABI
version. Features only supported from a specific version are explicitly marked
as such.

Kernel interface
================

Access rights
-------------

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: fs_access

Creating a new ruleset
----------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_create_ruleset

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_ruleset_attr

Extending a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_add_rule

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_rule_type landlock_path_beneath_attr

Enforcing a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_restrict_self

Current limitations
===================

Filesystem topology modification
--------------------------------

As for file renaming and linking, a sandboxed thread cannot modify its
filesystem topology, whether via :manpage:`mount(2)` or
:manpage:`pivot_root(2)`. However, :manpage:`chroot(2)` calls are not denied.

Special filesystems
-------------------

Access to regular files and directories can be restricted by Landlock,
according to the handled accesses of a ruleset.
However, files that do not
come from a user-visible filesystem (e.g. pipe, socket), but can still be
accessed through ``/proc/<pid>/fd/*``, cannot currently be explicitly
restricted. Likewise, some special kernel filesystems such as nsfs, which can
be accessed through ``/proc/<pid>/ns/*``, cannot currently be explicitly
restricted. However, thanks to the `ptrace restrictions`_, access to such
sensitive ``/proc`` files is automatically restricted according to domain
hierarchies. Future Landlock evolutions could still make it possible to
explicitly restrict such paths with dedicated ruleset flags.

Ruleset layers
--------------

There is a limit of 16 layers of stacked rulesets. This can be an issue for a
task willing to enforce a new ruleset in complement to its 16 inherited
rulesets. Once this limit is reached, sys_landlock_restrict_self() returns
E2BIG. It is then strongly suggested to carefully build rulesets once in the
life of a thread, especially for applications able to launch other applications
that may also want to sandbox themselves (e.g. shells, container managers,
etc.).

Memory usage
------------

Kernel memory allocated to create rulesets is accounted and can be restricted
by the Documentation/admin-guide/cgroup-v1/memory.rst.

Previous limitations
====================

File renaming and linking (ABI < 2)
-----------------------------------

Because Landlock targets unprivileged access controls, it needs to properly
handle composition of rules. Such a property also implies rule nesting.
Properly handling multiple layers of rulesets, each one of them able to
restrict access to files, also implies inheritance of the ruleset restrictions
from a parent to its hierarchy.
Because files are identified and restricted by
their hierarchy, moving or linking a file from one directory to another implies
propagation of the hierarchy constraints, or restriction of these actions
according to the potentially lost constraints. To protect against privilege
escalations through renaming or linking, and for the sake of simplicity,
Landlock previously limited linking and renaming to the same directory.
Starting with the Landlock ABI version 2, it is now possible to securely
control renaming and linking thanks to the new ``LANDLOCK_ACCESS_FS_REFER``
access right.

File truncation (ABI < 3)
-------------------------

File truncation could not be denied before the third Landlock ABI, so it is
always allowed when using a kernel that only supports the first or second ABI.

Starting with the Landlock ABI version 3, it is now possible to securely control
truncation thanks to the new ``LANDLOCK_ACCESS_FS_TRUNCATE`` access right.

.. _kernel_support:

Kernel support
==============

Landlock was first introduced in Linux 5.13 but it must be configured at build
time with ``CONFIG_SECURITY_LANDLOCK=y``. Landlock must also be enabled at boot
time like the other security modules. The list of security modules enabled by
default is set with ``CONFIG_LSM``. The kernel configuration should then
contain ``CONFIG_LSM=landlock,[...]`` with ``[...]`` as the list of other
potentially useful security modules for the running system (see the
``CONFIG_LSM`` help).

If the running kernel does not have ``landlock`` in ``CONFIG_LSM``, then we can
still enable it by adding ``lsm=landlock,[...]`` to
Documentation/admin-guide/kernel-parameters.rst thanks to the bootloader
configuration.

Questions and answers
=====================

What about user space sandbox managers?
---------------------------------------

Using a user space process to enforce restrictions on kernel resources can lead
to race conditions or inconsistent evaluations (i.e. `Incorrect mirroring of
the OS code and state
<https://www.ndss-symposium.org/ndss2003/traps-and-pitfalls-practical-problems-system-call-interposition-based-security-tools/>`_).

What about namespaces and containers?
-------------------------------------

Namespaces can help create sandboxes but they are not designed for
access-control and then miss useful features for such a use case (e.g. no
fine-grained restrictions). Moreover, their complexity can lead to security
issues, especially when untrusted processes can manipulate them (cf.
`Controlling access to user namespaces <https://lwn.net/Articles/673597/>`_).

Additional documentation
========================

* Documentation/security/landlock.rst
* https://landlock.io

.. Links
.. _samples/landlock/sandboxer.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/samples/landlock/sandboxer.c