.. SPDX-License-Identifier: GPL-2.0
.. Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net>
.. Copyright © 2019-2020 ANSSI
.. Copyright © 2021-2022 Microsoft Corporation

=====================================
Landlock: unprivileged access control
=====================================

:Author: Mickaël Salaün
:Date: September 2022

The goal of Landlock is to restrict ambient rights (e.g. global filesystem
access) for a set of processes.  Because Landlock is a stackable LSM, it makes
it possible to create safe security sandboxes as new security layers in
addition to the existing system-wide access controls.  This kind of sandbox is
expected to help mitigate the security impact of bugs or unexpected/malicious
behaviors in user space applications.  Landlock empowers any process, including
unprivileged ones, to securely restrict itself.

We can quickly make sure that Landlock is enabled in the running system by
looking for "landlock: Up and running" in kernel logs (as root): ``dmesg | grep
landlock || journalctl -kg landlock``.  Developers can also easily check for
Landlock support with a :ref:`related system call <landlock_abi_versions>`.  If
Landlock is not currently supported, we need to :ref:`configure the kernel
appropriately <kernel_support>`.

Landlock rules
==============

A Landlock rule describes an action on an object.  An object is currently a
file hierarchy, and the related filesystem actions are defined with `access
rights`_.  A set of rules is aggregated in a ruleset, which can then restrict
the thread enforcing it, and its future children.

Defining and enforcing a security policy
----------------------------------------

We first need to define the ruleset that will contain our rules.  For this
example, the ruleset will contain rules that only allow read actions, while
write actions will be denied.  The ruleset then needs to handle both of these
kinds of actions.  This is required for backward and forward compatibility
(i.e. the kernel and user space may not know each other's supported
restrictions), hence the need to be explicit about the denied-by-default access
rights.

.. code-block:: c

    struct landlock_ruleset_attr ruleset_attr = {
        .handled_access_fs =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_WRITE_FILE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_FILE |
            LANDLOCK_ACCESS_FS_MAKE_CHAR |
            LANDLOCK_ACCESS_FS_MAKE_DIR |
            LANDLOCK_ACCESS_FS_MAKE_REG |
            LANDLOCK_ACCESS_FS_MAKE_SOCK |
            LANDLOCK_ACCESS_FS_MAKE_FIFO |
            LANDLOCK_ACCESS_FS_MAKE_BLOCK |
            LANDLOCK_ACCESS_FS_MAKE_SYM |
            LANDLOCK_ACCESS_FS_REFER,
    };

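The snippets in this section call ``landlock_create_ruleset()``,
``landlock_add_rule()`` and ``landlock_restrict_self()`` as plain functions.
The C library may not provide such wrappers, in which case a program can define
them itself with :manpage:`syscall(2)`.  The following is a minimal sketch
modeled on `samples/landlock/sandboxer.c`_, assuming that the installed kernel
headers provide ``linux/landlock.h`` and the ``__NR_landlock_*`` system call
numbers:

.. code-block:: c

    #include <linux/landlock.h>
    #include <stddef.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Thin wrappers around the raw system calls, only defined if missing. */
    #ifndef landlock_create_ruleset
    static inline int
    landlock_create_ruleset(const struct landlock_ruleset_attr *const attr,
                            const size_t size, const __u32 flags)
    {
        return syscall(__NR_landlock_create_ruleset, attr, size, flags);
    }
    #endif

    #ifndef landlock_add_rule
    static inline int
    landlock_add_rule(const int ruleset_fd,
                      const enum landlock_rule_type rule_type,
                      const void *const rule_attr, const __u32 flags)
    {
        return syscall(__NR_landlock_add_rule, ruleset_fd, rule_type,
                       rule_attr, flags);
    }
    #endif

    #ifndef landlock_restrict_self
    static inline int
    landlock_restrict_self(const int ruleset_fd, const __u32 flags)
    {
        return syscall(__NR_landlock_restrict_self, ruleset_fd, flags);
    }
    #endif

Each wrapper returns the raw system call result: a non-negative value on
success, or -1 with ``errno`` set on error.
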
Because we may not know on which kernel version an application will be
executed, it is safer to follow a best-effort security approach.  Indeed, we
should try to protect users as much as possible, whichever kernel they are
using.  To avoid binary enforcement (i.e. either all security features or
none), we can leverage a dedicated Landlock command to get the current version
of the Landlock ABI and adapt the handled accesses.  Let's check if we should
remove the ``LANDLOCK_ACCESS_FS_REFER`` access right, which is only supported
starting with the second version of the ABI.

.. code-block:: c

    int abi;

    abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
    if (abi < 2) {
        ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_REFER;
    }

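The same adaptation can be factored into a small pure function, which is easier
to test and to extend when new ABI versions appear.  This is only a sketch:
``supported_access_fs()`` is a hypothetical helper name, not part of the
Landlock API, and the fallback definition of ``LANDLOCK_ACCESS_FS_REFER`` is
only there for kernel headers that predate the second ABI version.

.. code-block:: c

    #include <linux/landlock.h>

    #ifndef LANDLOCK_ACCESS_FS_REFER
    /* Fallback for UAPI headers that predate ABI version 2. */
    #define LANDLOCK_ACCESS_FS_REFER (1ULL << 13)
    #endif

    /*
     * Hypothetical helper: returns the subset of @handled_access_fs known to
     * a kernel reporting Landlock ABI version @abi.  An @abi lower than 1
     * (e.g. -1 when Landlock is unavailable) yields 0: nothing can be handled.
     */
    static __u64 supported_access_fs(__u64 handled_access_fs, int abi)
    {
        if (abi < 1)
            return 0;
        if (abi < 2)
            handled_access_fs &= ~LANDLOCK_ACCESS_FS_REFER;
        return handled_access_fs;
    }

Unlike the inline check above, this sketch returns an empty mask when the
version query fails, so that the caller can decide to skip sandboxing instead
of discovering the failure when creating the ruleset.
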
This makes it possible to create an inclusive ruleset that will contain our
rules.

.. code-block:: c

    int ruleset_fd;

    ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
    if (ruleset_fd < 0) {
        perror("Failed to create a ruleset");
        return 1;
    }

We can now add a new rule to this ruleset thanks to the returned file
descriptor referring to this ruleset.  The rule will only allow reading the
file hierarchy ``/usr``.  Without another rule, write actions would then be
denied by the ruleset.  To add ``/usr`` to the ruleset, we open it with the
``O_PATH`` flag and fill the &struct landlock_path_beneath_attr with this file
descriptor.

.. code-block:: c

    int err;
    struct landlock_path_beneath_attr path_beneath = {
        .allowed_access =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR,
    };

    path_beneath.parent_fd = open("/usr", O_PATH | O_CLOEXEC);
    if (path_beneath.parent_fd < 0) {
        perror("Failed to open file");
        close(ruleset_fd);
        return 1;
    }
    err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
                            &path_beneath, 0);
    close(path_beneath.parent_fd);
    if (err) {
        perror("Failed to update ruleset");
        close(ruleset_fd);
        return 1;
    }

It may also be required to create rules following the same logic as explained
for the ruleset creation, by filtering access rights according to the Landlock
ABI version.  In this example, this is not required because
``LANDLOCK_ACCESS_FS_REFER`` is not allowed by any rule.

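When several hierarchies need a rule, the open/add/close sequence and the
filtering of access rights can be grouped in one function.  The sketch below
uses a hypothetical helper named ``allow_path()``; its ``handled`` parameter is
assumed to be the ``handled_access_fs`` value used to create the ruleset, after
adaptation to the ABI version.  It invokes the system call directly through
:manpage:`syscall(2)` to stay self-contained, and the ``O_PATH`` fallback is
only an assumption for builds where the C library headers hide that flag.

.. code-block:: c

    #ifndef _GNU_SOURCE
    #define _GNU_SOURCE /* for O_PATH */
    #endif
    #include <errno.h>
    #include <fcntl.h>
    #include <linux/landlock.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    #ifndef O_PATH
    /* Assumption: generic Linux value, used only if the headers hide it. */
    #define O_PATH 010000000
    #endif

    /*
     * Hypothetical helper: allows @access beneath @path in the ruleset
     * referred to by @ruleset_fd.  @access is first intersected with
     * @handled so that the rule never requests an unhandled access right.
     * Returns 0 on success, or -1 with errno set.
     */
    static int allow_path(int ruleset_fd, const char *path, __u64 access,
                          __u64 handled)
    {
        struct landlock_path_beneath_attr path_beneath = {
            .allowed_access = access & handled,
        };
        int err, saved_errno;

        /* Nothing left to allow: skip the rule instead of adding an empty one. */
        if (!path_beneath.allowed_access)
            return 0;

        path_beneath.parent_fd = open(path, O_PATH | O_CLOEXEC);
        if (path_beneath.parent_fd < 0)
            return -1;
        err = syscall(__NR_landlock_add_rule, ruleset_fd,
                      LANDLOCK_RULE_PATH_BENEATH, &path_beneath, 0);
        saved_errno = errno;
        close(path_beneath.parent_fd);
        errno = saved_errno; /* report the error from the rule, not close() */
        return err ? -1 : 0;
    }

With such a helper, the rule above could be added with ``allow_path(ruleset_fd,
"/usr", path_beneath.allowed_access, ruleset_attr.handled_access_fs)``.
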
We now have a ruleset with one rule allowing read access to ``/usr`` while
denying all other handled accesses for the filesystem.  The next step is to
restrict the current thread from gaining more privileges (e.g. thanks to a SUID
binary).

.. code-block:: c

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
        perror("Failed to restrict privileges");
        close(ruleset_fd);
        return 1;
    }

The current thread is now ready to sandbox itself with the ruleset.

.. code-block:: c

    if (landlock_restrict_self(ruleset_fd, 0)) {
        perror("Failed to enforce ruleset");
        close(ruleset_fd);
        return 1;
    }
    close(ruleset_fd);

If the ``landlock_restrict_self`` system call succeeds, the current thread is
now restricted and this policy will be enforced on all its subsequently created
children as well.  Once a thread is landlocked, there is no way to remove its
security policy; only adding more restrictions is allowed.  These threads are
now in a new Landlock domain, which is a merge of their parent's domain (if
any) with the new ruleset.

Full working code can be found in `samples/landlock/sandboxer.c`_.

Good practices
--------------

It is recommended to set access rights on file hierarchy leaves as much as
possible.  For instance, it is better to have ``~/doc/`` as a read-only
hierarchy and ``~/tmp/`` as a read-write hierarchy, compared to ``~/`` as a
read-only hierarchy and ``~/tmp/`` as a read-write hierarchy.
Following this good practice leads to self-sufficient hierarchies that do not
depend on their location (i.e. parent directories).  This is particularly
relevant when we want to allow linking or renaming.  Indeed, having consistent
access rights per directory makes it possible to change the location of such a
directory without relying on the destination directory access rights (except
those that are required for this operation, see ``LANDLOCK_ACCESS_FS_REFER``
documentation).
Having self-sufficient hierarchies also helps to tighten the required access
rights to the minimal set of data.  This also helps avoid sinkhole directories,
i.e. directories where data can be linked to but not linked from.  However,
this depends on data organization, which might not be controlled by developers.
In this case, granting read-write access to ``~/tmp/``, instead of write-only
access, would potentially allow moving ``~/tmp/`` to a non-readable directory
while still keeping the ability to list the content of ``~/tmp/``.

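As an illustration of the ``~/doc/`` and ``~/tmp/`` example, a policy can be
written as a table with one entry per leaf hierarchy.  Everything in this
sketch other than the ``LANDLOCK_ACCESS_FS_*`` constants is hypothetical: the
two access groups, the ``struct leaf_rule`` type and the choice of rights are
assumptions made for the example.

.. code-block:: c

    #include <linux/landlock.h>

    /* Hypothetical access groups for the two kinds of hierarchies. */
    #define ACCESS_FS_READ_ONLY ( \
        LANDLOCK_ACCESS_FS_READ_FILE | \
        LANDLOCK_ACCESS_FS_READ_DIR)

    #define ACCESS_FS_READ_WRITE ( \
        ACCESS_FS_READ_ONLY | \
        LANDLOCK_ACCESS_FS_WRITE_FILE | \
        LANDLOCK_ACCESS_FS_REMOVE_DIR | \
        LANDLOCK_ACCESS_FS_REMOVE_FILE | \
        LANDLOCK_ACCESS_FS_MAKE_DIR | \
        LANDLOCK_ACCESS_FS_MAKE_REG)

    /* Hypothetical policy entry: one rule per self-sufficient hierarchy. */
    struct leaf_rule {
        const char *path; /* relative to the user's home directory */
        __u64 allowed_access;
    };

    /* No entry for the home directory itself: only its leaves get rules. */
    static const struct leaf_rule leaf_rules[] = {
        { "doc", ACCESS_FS_READ_ONLY },
        { "tmp", ACCESS_FS_READ_WRITE },
    };

Each entry would then be turned into one ``LANDLOCK_RULE_PATH_BENEATH`` rule,
after resolving the path against the home directory.
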
Layers of file path access rights
---------------------------------

Each time a thread enforces a ruleset on itself, it updates its Landlock domain
with a new layer of policy.  Indeed, this complementary policy is stacked with
any other rulesets that may already restrict this thread.  A sandboxed thread
can then safely add more constraints to itself with a new enforced ruleset.

One policy layer grants access to a file path if at least one of its rules
encountered on the path grants the access.  A sandboxed thread can only access
a file path if all its enforced policy layers grant the access as well as all
the other system access controls (e.g. filesystem DAC, other LSM policies,
etc.).

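The two statements above can be modeled in a few lines of user space code.
This is an illustrative model only, not the kernel's implementation: the
``model_*`` types and functions are hypothetical, paths are compared as plain
strings without trailing slashes, and only one access bit is checked at a time.
The model also assumes that a layer does not restrict an access right it does
not handle.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical types; they are not kernel or UAPI structures. */
    struct model_rule {
        const char *hierarchy; /* e.g. "/usr" */
        unsigned long allowed; /* access bits allowed beneath it */
    };

    struct model_layer {
        unsigned long handled; /* access bits restricted by this layer */
        const struct model_rule *rules;
        size_t num_rules;
    };

    /* True if @path is @hierarchy itself or lies beneath it. */
    static bool model_is_beneath(const char *path, const char *hierarchy)
    {
        size_t len = strlen(hierarchy);

        if (strncmp(path, hierarchy, len) != 0)
            return false;
        if (len > 0 && hierarchy[len - 1] == '/')
            return true; /* the root directory contains every path */
        return path[len] == '\0' || path[len] == '/';
    }

    /* A layer grants @access if it is unhandled or allowed by one rule. */
    static bool model_layer_grants(const struct model_layer *layer,
                                   const char *path, unsigned long access)
    {
        if (!(layer->handled & access))
            return true;
        for (size_t i = 0; i < layer->num_rules; i++) {
            if (model_is_beneath(path, layer->rules[i].hierarchy) &&
                (layer->rules[i].allowed & access))
                return true;
        }
        return false;
    }

    /* A domain grants @access only if every stacked layer grants it. */
    static bool model_domain_grants(const struct model_layer *layers,
                                    size_t num_layers, const char *path,
                                    unsigned long access)
    {
        for (size_t i = 0; i < num_layers; i++) {
            if (!model_layer_grants(&layers[i], path, access))
                return false;
        }
        return true;
    }

In this model, a first layer allowing reads beneath ``/usr`` followed by a
second layer allowing reads only beneath ``/usr/share`` leaves ``/usr/share``
readable but not ``/usr/bin``: adding a layer can only narrow what the previous
layers granted.
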
Bind mounts and OverlayFS
-------------------------

Landlock restricts access to file hierarchies, which means that these access
rights can be propagated with bind mounts (cf.
Documentation/filesystems/sharedsubtree.rst) but not with
Documentation/filesystems/overlayfs.rst.

A bind mount mirrors a source file hierarchy to a destination.  The destination
hierarchy is then composed of the exact same files, on which Landlock rules can
be tied, either via the source or the destination path.  These rules restrict
access when they are encountered on a path, which means that they can restrict
access to multiple file hierarchies at the same time, whether these hierarchies
are the result of bind mounts or not.

An OverlayFS mount point consists of upper and lower layers.  These layers are
combined in a merge directory, which is the result of the mount point.  This
merge hierarchy may include files from the upper and lower layers, but
modifications performed on the merge hierarchy are only reflected on the upper
layer.  From a Landlock policy point of view, each OverlayFS layer and merge
hierarchy is standalone and contains its own set of files and directories,
which is different from bind mounts.  A policy restricting an OverlayFS layer
will not restrict the resulting merged hierarchy, and vice versa.  Landlock
users should then only think about file hierarchies they want to allow access
to, regardless of the underlying filesystem.

Inheritance
-----------

Every new thread resulting from a :manpage:`clone(2)` inherits Landlock domain
restrictions from its parent.  This is similar to the seccomp inheritance (cf.
Documentation/userspace-api/seccomp_filter.rst) or any other LSM dealing with
task's :manpage:`credentials(7)`.  For instance, one process's thread may apply
Landlock rules to itself, but they will not be automatically applied to other
sibling threads (unlike POSIX thread credential changes, cf.
:manpage:`nptl(7)`).

When a thread sandboxes itself, we have the guarantee that the related security
policy will stay enforced on all this thread's descendants.  This allows
creating standalone and modular security policies per application, which will
automatically be composed between themselves according to their runtime parent
policies.

Ptrace restrictions
-------------------

A sandboxed process has fewer privileges than a non-sandboxed process and must
then be subject to additional restrictions when manipulating another process.
To be allowed to use :manpage:`ptrace(2)` and related syscalls on a target
process, a sandboxed process should have a subset of the target process rules,
which means the tracee must be in a sub-domain of the tracer.

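One way to read the sub-domain condition is as a walk up the chain of stacked
domains.  The sketch below is a user space model with hypothetical names
(``struct model_domain``, ``model_may_ptrace()``), not kernel code.  It assumes
that a tracer without any Landlock domain is not restricted by Landlock, and it
ignores all other checks that apply to :manpage:`ptrace(2)`.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical model: a domain points to the domain it was stacked on. */
    struct model_domain {
        const struct model_domain *parent; /* NULL for a first-level domain */
    };

    /*
     * True if @tracee's domain is @tracer's domain or one of its sub-domains.
     * A NULL domain stands for a thread that is not sandboxed by Landlock.
     */
    static bool model_may_ptrace(const struct model_domain *tracer,
                                 const struct model_domain *tracee)
    {
        if (!tracer)
            return true; /* assumption: no Landlock restriction applies */
        for (; tracee; tracee = tracee->parent) {
            if (tracee == tracer)
                return true;
        }
        return false;
    }

In this model, a sandboxed parent may trace a child that stacked further
rulesets on top of the inherited domain, but that child may not trace its
parent, and two processes in unrelated domains may not trace each other.
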
Compatibility
=============

Backward and forward compatibility
----------------------------------

Landlock is designed to be compatible with past and future versions of the
kernel.  This is achieved thanks to the system call attributes and the
associated bitflags, particularly the ruleset's ``handled_access_fs``.  Making
handled access rights explicit enables the kernel and user space to have a
clear contract with each other.  This is required to make sure sandboxing will
not get stricter with a system update, which could break applications.

Developers can subscribe to the `Landlock mailing list
<https://subspace.kernel.org/lists.linux.dev.html>`_ to knowingly update and
test their applications with the latest available features.  In the interest of
users, and because they may use different kernel versions, it is strongly
encouraged to follow a best-effort security approach by checking the Landlock
ABI version at runtime and only enforcing the supported features.

.. _landlock_abi_versions:

Landlock ABI versions
---------------------

The Landlock ABI version can be read with the sys_landlock_create_ruleset()
system call:

.. code-block:: c

    int abi;

    abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
    if (abi < 0) {
        switch (errno) {
        case ENOSYS:
            printf("Landlock is not supported by the current kernel.\n");
            break;
        case EOPNOTSUPP:
            printf("Landlock is currently disabled.\n");
            break;
        }
        return 0;
    }
    if (abi >= 2) {
        printf("Landlock supports LANDLOCK_ACCESS_FS_REFER.\n");
    }

The following kernel interfaces are implicitly supported by the first ABI
version.  Features only supported from a specific version are explicitly marked
as such.

Kernel interface
================

Access rights
-------------

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: fs_access

Creating a new ruleset
----------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_create_ruleset

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_ruleset_attr

Extending a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_add_rule

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_rule_type landlock_path_beneath_attr

Enforcing a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_restrict_self

Current limitations
===================

Filesystem topology modification
--------------------------------

As with file renaming and linking, a sandboxed thread cannot modify its
filesystem topology, whether via :manpage:`mount(2)` or
:manpage:`pivot_root(2)`.  However, :manpage:`chroot(2)` calls are not denied.

Special filesystems
-------------------

Access to regular files and directories can be restricted by Landlock,
according to the handled accesses of a ruleset.  However, files that do not
come from a user-visible filesystem (e.g. pipe, socket), but can still be
accessed through ``/proc/<pid>/fd/*``, cannot currently be explicitly
restricted.  Likewise, some special kernel filesystems such as nsfs, which can
be accessed through ``/proc/<pid>/ns/*``, cannot currently be explicitly
restricted.  However, thanks to the `ptrace restrictions`_, access to such
sensitive ``/proc`` files is automatically restricted according to domain
hierarchies.  Future Landlock evolutions could still make it possible to
explicitly restrict such paths with dedicated ruleset flags.

Ruleset layers
--------------

There is a limit of 16 layers of stacked rulesets.  This can be an issue for a
task willing to enforce a new ruleset in addition to its 16 inherited
rulesets.  Once this limit is reached, sys_landlock_restrict_self() returns
E2BIG.  It is then strongly suggested to carefully build rulesets once in the
life of a thread, especially for applications able to launch other applications
that may also want to sandbox themselves (e.g. shells, container managers,
etc.).

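An application that may run below many stacked rulesets can distinguish this
case from other failures when enforcing its own ruleset.  The sketch below only
classifies the result of ``landlock_restrict_self()``; the enumeration and the
function name are hypothetical, and what to do when the layer limit is reached
(continue with the inherited restrictions only, or abort) remains a decision
for the application.

.. code-block:: c

    #include <errno.h>

    /* Hypothetical outcome of an attempt to enforce a new ruleset. */
    enum sandbox_status {
        SANDBOX_ENFORCED,    /* the new layer is now enforced */
        SANDBOX_LAYER_LIMIT, /* E2BIG: only inherited layers are enforced */
        SANDBOX_FAILED,      /* any other error */
    };

    /*
     * Classifies the return value @ret of landlock_restrict_self() and the
     * errno value @err saved right after the call.
     */
    static enum sandbox_status classify_restrict_result(int ret, int err)
    {
        if (ret == 0)
            return SANDBOX_ENFORCED;
        if (err == E2BIG)
            return SANDBOX_LAYER_LIMIT;
        return SANDBOX_FAILED;
    }

A caller would typically save ``errno`` immediately after the system call and
pass it as ``err``, before any other library call can overwrite it.
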
Memory usage
------------

Kernel memory allocated to create rulesets is accounted and can be restricted
by the Documentation/admin-guide/cgroup-v1/memory.rst.

Previous limitations
====================

File renaming and linking (ABI < 2)
-----------------------------------

Because Landlock targets unprivileged access controls, it needs to properly
handle composition of rules.  Such a property also implies rule nesting.
Properly handling multiple layers of rulesets, each one of them able to
restrict access to files, also implies inheritance of the ruleset restrictions
from a parent to its hierarchy.  Because files are identified and restricted by
their hierarchy, moving or linking a file from one directory to another implies
propagation of the hierarchy constraints, or restriction of these actions
according to the potentially lost constraints.  To protect against privilege
escalations through renaming or linking, and for the sake of simplicity,
Landlock previously limited linking and renaming to the same directory.
Starting with the Landlock ABI version 2, it is now possible to securely
control renaming and linking thanks to the new ``LANDLOCK_ACCESS_FS_REFER``
access right.

.. _kernel_support:

Kernel support
==============

Landlock was first introduced in Linux 5.13 but it must be configured at build
time with ``CONFIG_SECURITY_LANDLOCK=y``.  Landlock must also be enabled at boot
time, like the other security modules.  The list of security modules enabled by
default is set with ``CONFIG_LSM``.  The kernel configuration should then
contain ``CONFIG_LSM=landlock,[...]`` with ``[...]`` as the list of other
potentially useful security modules for the running system (see the
``CONFIG_LSM`` help).

If the running kernel does not have ``landlock`` in ``CONFIG_LSM``, then we can
still enable it by adding ``lsm=landlock,[...]`` to
Documentation/admin-guide/kernel-parameters.rst thanks to the bootloader
configuration.

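Besides the kernel logs, the list of active security modules can usually be
read from ``/sys/kernel/security/lsm``.  The sketch below assumes that
securityfs is mounted at ``/sys/kernel/security`` and that the file contains a
comma-separated list of module names; both function names are hypothetical.

.. code-block:: c

    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    /* True if @name is one of the comma-separated entries of @list. */
    static bool lsm_list_contains(const char *list, const char *name)
    {
        size_t name_len = strlen(name);

        while (*list) {
            /* Length of the current entry, up to a separator or newline. */
            size_t len = strcspn(list, ",\n");

            if (len == name_len && strncmp(list, name, len) == 0)
                return true;
            list += len;
            if (*list)
                list++; /* skip the separator */
        }
        return false;
    }

    /* Returns 1 if Landlock is listed as active, 0 if not, -1 if unknown. */
    static int landlock_is_listed(void)
    {
        char buf[256];
        FILE *file = fopen("/sys/kernel/security/lsm", "r");

        if (!file)
            return -1;
        if (!fgets(buf, sizeof(buf), file))
            buf[0] = '\0';
        fclose(file);
        return lsm_list_contains(buf, "landlock");
    }

The comparison is done on whole entries, so that a module whose name merely
starts with ``landlock`` would not be mistaken for Landlock itself.  Querying
the ABI version, as described in :ref:`landlock_abi_versions`, remains the way
for an application to check what it can actually use.
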
Questions and answers
=====================

What about user space sandbox managers?
---------------------------------------

Using a user space process to enforce restrictions on kernel resources can lead
to race conditions or inconsistent evaluations (i.e. `Incorrect mirroring of
the OS code and state
<https://www.ndss-symposium.org/ndss2003/traps-and-pitfalls-practical-problems-system-call-interposition-based-security-tools/>`_).

What about namespaces and containers?
-------------------------------------

Namespaces can help create sandboxes but they are not designed for access
control and therefore lack useful features for such a use case (e.g. no
fine-grained restrictions).  Moreover, their complexity can lead to security
issues, especially when untrusted processes can manipulate them (cf.
`Controlling access to user namespaces <https://lwn.net/Articles/673597/>`_).

Additional documentation
========================

* Documentation/security/landlock.rst
* https://landlock.io

.. Links
.. _samples/landlock/sandboxer.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/samples/landlock/sandboxer.c
