.. SPDX-License-Identifier: GPL-2.0
.. Copyright © 2017-2020 Mickaël Salaün <mic@digikod.net>
.. Copyright © 2019-2020 ANSSI
.. Copyright © 2021 Microsoft Corporation

=====================================
Landlock: unprivileged access control
=====================================

:Author: Mickaël Salaün
:Date: March 2021

The goal of Landlock is to restrict ambient rights (e.g. global filesystem
access) for a set of processes.  Because Landlock is a stackable LSM, it makes
it possible to create safe security sandboxes as new security layers in
addition to the existing system-wide access controls.  This kind of sandbox is
expected to help mitigate the security impact of bugs or unexpected/malicious
behaviors in user space applications.  Landlock empowers any process,
including unprivileged ones, to securely restrict itself.

Landlock rules
==============

A Landlock rule describes an action on an object.  An object is currently a
file hierarchy, and the related filesystem actions are defined with `access
rights`_.  A set of rules is aggregated in a ruleset, which can then restrict
the thread enforcing it and its future children.

Defining and enforcing a security policy
----------------------------------------

We first need to create the ruleset that will contain our rules.  For this
example, the ruleset will contain rules that only allow read actions, while
write actions will be denied.  The ruleset then needs to handle both of these
kinds of actions.

.. code-block:: c

    int ruleset_fd;
    struct landlock_ruleset_attr ruleset_attr = {
        .handled_access_fs =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_WRITE_FILE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_DIR |
            LANDLOCK_ACCESS_FS_REMOVE_FILE |
            LANDLOCK_ACCESS_FS_MAKE_CHAR |
            LANDLOCK_ACCESS_FS_MAKE_DIR |
            LANDLOCK_ACCESS_FS_MAKE_REG |
            LANDLOCK_ACCESS_FS_MAKE_SOCK |
            LANDLOCK_ACCESS_FS_MAKE_FIFO |
            LANDLOCK_ACCESS_FS_MAKE_BLOCK |
            LANDLOCK_ACCESS_FS_MAKE_SYM,
    };

    ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
    if (ruleset_fd < 0) {
        perror("Failed to create a ruleset");
        return 1;
    }

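The snippets in this section call ``landlock_create_ruleset()``,
``landlock_add_rule()`` and ``landlock_restrict_self()`` as if they were
ordinary functions.  A C library may not provide wrappers for these system
calls; in that case they can be defined with :manpage:`syscall(2)`.  The
following is a minimal sketch, assuming kernel headers that ship
``linux/landlock.h`` and define the ``__NR_landlock_*`` syscall numbers.  The
``landlock_abi_version()`` helper is a hypothetical name used only for
illustration; it relies on the ``LANDLOCK_CREATE_RULESET_VERSION`` flag of
``landlock_create_ruleset()``.

.. code-block:: c

    #define _GNU_SOURCE
    #include <linux/landlock.h>
    #include <stddef.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Thin wrappers around the raw system calls. */
    static inline int landlock_create_ruleset(
            const struct landlock_ruleset_attr *const attr,
            const size_t size, const __u32 flags)
    {
        return syscall(__NR_landlock_create_ruleset, attr, size, flags);
    }

    static inline int landlock_add_rule(const int ruleset_fd,
            const enum landlock_rule_type rule_type,
            const void *const rule_attr, const __u32 flags)
    {
        return syscall(__NR_landlock_add_rule, ruleset_fd, rule_type,
                       rule_attr, flags);
    }

    static inline int landlock_restrict_self(const int ruleset_fd,
                                             const __u32 flags)
    {
        return syscall(__NR_landlock_restrict_self, ruleset_fd, flags);
    }

    /*
     * Hypothetical helper: returns the Landlock ABI version (>= 1), or -1
     * with errno set if Landlock cannot be used by the caller.
     */
    static inline int landlock_abi_version(void)
    {
        return landlock_create_ruleset(NULL, 0,
                                       LANDLOCK_CREATE_RULESET_VERSION);
    }

Calling ``landlock_abi_version()`` before building a ruleset lets a program
skip sandboxing, or report it, when the running kernel does not support
Landlock.
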
We can now add a new rule to this ruleset using the returned file descriptor,
which refers to this ruleset.  The rule will only allow reading the file
hierarchy ``/usr``.  Without another rule, write actions would then be denied
by the ruleset.  To add ``/usr`` to the ruleset, we open it with the
``O_PATH`` flag and fill the &struct landlock_path_beneath_attr with this file
descriptor.

.. code-block:: c

    int err;
    struct landlock_path_beneath_attr path_beneath = {
        .allowed_access =
            LANDLOCK_ACCESS_FS_EXECUTE |
            LANDLOCK_ACCESS_FS_READ_FILE |
            LANDLOCK_ACCESS_FS_READ_DIR,
    };

    path_beneath.parent_fd = open("/usr", O_PATH | O_CLOEXEC);
    if (path_beneath.parent_fd < 0) {
        perror("Failed to open file");
        close(ruleset_fd);
        return 1;
    }
    err = landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH,
                            &path_beneath, 0);
    close(path_beneath.parent_fd);
    if (err) {
        perror("Failed to update ruleset");
        close(ruleset_fd);
        return 1;
    }

We now have a ruleset with one rule allowing read access to ``/usr`` while
denying all other handled accesses for the filesystem.  The next step is to
prevent the current thread from gaining more privileges (e.g. through a SUID
binary).

.. code-block:: c

    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)) {
        perror("Failed to restrict privileges");
        close(ruleset_fd);
        return 1;
    }

The current thread is now ready to sandbox itself with the ruleset.

.. code-block:: c

    if (landlock_restrict_self(ruleset_fd, 0)) {
        perror("Failed to enforce ruleset");
        close(ruleset_fd);
        return 1;
    }
    close(ruleset_fd);

If the `landlock_restrict_self` system call succeeds, the current thread is now
restricted and this policy will be enforced on all its subsequently created
children as well.  Once a thread is landlocked, there is no way to remove its
security policy; only adding more restrictions is allowed.  These threads are
now in a new Landlock domain, which is the merge of their parent's domain (if
any) with the new ruleset.

Full working code can be found in `samples/landlock/sandboxer.c`_.

Layers of file path access rights
---------------------------------

Each time a thread enforces a ruleset on itself, it updates its Landlock domain
with a new layer of policy.  This complementary policy is stacked on top of any
other rulesets already restricting this thread.  A sandboxed thread can then
safely add more constraints to itself with a new enforced ruleset.

One policy layer grants access to a file path if at least one of its rules
encountered on the path grants the access.  A sandboxed thread can only access
a file path if all its enforced policy layers grant the access, as well as all
the other system access controls (e.g. filesystem DAC, other LSM policies).

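The evaluation described above can be sketched as a small user space model.
This is not kernel code: ``struct model_layer``, the ``MODEL_*`` access rights
and both functions are hypothetical names introduced for illustration, and the
model checks one access right at a time.  It also assumes that a layer does not
restrict an access right it does not handle.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Illustrative access rights, not the real LANDLOCK_ACCESS_FS_* values. */
    #define MODEL_READ   (1ULL << 0)
    #define MODEL_WRITE  (1ULL << 1)

    /* One policy layer, reduced to the rules met while walking a file path. */
    struct model_layer {
        uint64_t handled;       /* access rights this layer restricts */
        const uint64_t *rules;  /* allowed accesses of each rule on the path */
        size_t num_rules;
    };

    /* A layer grants a right if it does not handle it or if a rule allows it. */
    static bool model_layer_grants(const struct model_layer *layer,
                                   uint64_t right)
    {
        size_t i;

        if (!(layer->handled & right))
            return true;
        for (i = 0; i < layer->num_rules; i++) {
            if (layer->rules[i] & right)
                return true;
        }
        return false;
    }

    /* The domain grants a right only if every stacked layer grants it. */
    static bool model_domain_grants(const struct model_layer *layers,
                                    size_t num_layers, uint64_t right)
    {
        size_t i;

        for (i = 0; i < num_layers; i++) {
            if (!model_layer_grants(&layers[i], right))
                return false;
        }
        return true;
    }

With a first layer whose rule allows reading and writing, and a second layer
whose rule only allows reading, the model grants ``MODEL_READ`` and denies
``MODEL_WRITE``: a new layer can only remove access, never add it.
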
Bind mounts and OverlayFS
-------------------------

Landlock restricts access to file hierarchies, which means that these access
rights can be propagated with bind mounts (cf.
:doc:`/filesystems/sharedsubtree`) but not with :doc:`/filesystems/overlayfs`.

A bind mount mirrors a source file hierarchy to a destination.  The destination
hierarchy is then composed of the exact same files, on which Landlock rules can
be tied, either via the source or the destination path.  These rules restrict
access when they are encountered on a path, which means that they can restrict
access to multiple file hierarchies at the same time, whether these hierarchies
are the result of bind mounts or not.

An OverlayFS mount point consists of upper and lower layers.  These layers are
combined in a merge directory, which is the result of the mount point.  This
merge hierarchy may include files from the upper and lower layers, but
modifications performed on the merge hierarchy are only reflected in the upper
layer.  From a Landlock policy point of view, each OverlayFS layer and merge
hierarchy is standalone and contains its own set of files and directories,
which is different from bind mounts.  A policy restricting an OverlayFS layer
will not restrict the resulting merged hierarchy, and vice versa.  Landlock
users should then only think about file hierarchies they want to allow access
to, regardless of the underlying filesystem.

Inheritance
-----------

Every new thread resulting from a :manpage:`clone(2)` inherits Landlock domain
restrictions from its parent.  This is similar to the seccomp inheritance (cf.
:doc:`/userspace-api/seccomp_filter`) or any other LSM dealing with a task's
:manpage:`credentials(7)`.  For instance, one thread of a process may apply
Landlock rules to itself, but they will not be automatically applied to its
sibling threads (unlike POSIX thread credential changes, cf.
:manpage:`nptl(7)`).

When a thread sandboxes itself, we have the guarantee that the related security
policy will stay enforced on all this thread's descendants.  This allows
creating standalone and modular security policies per application, which will
automatically be composed between themselves according to their runtime parent
policies.

Ptrace restrictions
-------------------

A sandboxed process has fewer privileges than a non-sandboxed process and must
therefore be subject to additional restrictions when manipulating another
process.  To be allowed to use :manpage:`ptrace(2)` and related syscalls on a
target process, a sandboxed process should have a subset of the target
process's rules, which means the tracee must be in a sub-domain of the tracer.

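The sub-domain condition can be illustrated with a small user space model.
This is a sketch under stated assumptions, not the kernel implementation:
``struct model_domain`` and ``model_may_ptrace()`` are hypothetical names, a
``NULL`` domain stands for a non-sandboxed process, and a tracee in the very
same domain as the tracer is treated as allowed.

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical model of a Landlock domain: each one points to its parent. */
    struct model_domain {
        const struct model_domain *parent;  /* NULL for a first-level domain */
    };

    /*
     * Returns true if the tracee's domain is the tracer's domain or one of
     * its sub-domains.  A NULL domain stands for a non-sandboxed process.
     */
    static bool model_may_ptrace(const struct model_domain *tracer,
                                 const struct model_domain *tracee)
    {
        const struct model_domain *walker;

        /* A non-sandboxed tracer gets no additional restriction. */
        if (!tracer)
            return true;
        /* Walk up from the tracee's domain, looking for the tracer's one. */
        for (walker = tracee; walker; walker = walker->parent) {
            if (walker == tracer)
                return true;
        }
        return false;
    }

In this model a process can trace its more restricted descendants, but it can
trace neither its less restricted parent nor a process sandboxed in an
unrelated sibling domain.
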
Kernel interface
================

Access rights
-------------

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: fs_access

Creating a new ruleset
----------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_create_ruleset

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_ruleset_attr

Extending a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_add_rule

.. kernel-doc:: include/uapi/linux/landlock.h
    :identifiers: landlock_rule_type landlock_path_beneath_attr

Enforcing a ruleset
-------------------

.. kernel-doc:: security/landlock/syscalls.c
    :identifiers: sys_landlock_restrict_self

Current limitations
===================

File renaming and linking
-------------------------

Because Landlock targets unprivileged access controls, it needs to properly
handle the composition of rules.  Such a property also implies rule nesting.
Properly handling multiple layers of rulesets, each one of them able to
restrict access to files, also implies inheriting the ruleset restrictions from
a parent to its hierarchy.  Because files are identified and restricted by
their hierarchy, moving or linking a file from one directory to another implies
propagating the hierarchy constraints.  To protect against privilege
escalations through renaming or linking, and for the sake of simplicity,
Landlock currently limits linking and renaming to the same directory.  Future
Landlock evolutions will enable more flexibility for renaming and linking, with
dedicated ruleset flags.

Filesystem topology modification
--------------------------------

As with file renaming and linking, a sandboxed thread cannot modify its
filesystem topology, whether via :manpage:`mount(2)` or
:manpage:`pivot_root(2)`.  However, :manpage:`chroot(2)` calls are not denied.

Special filesystems
-------------------

Access to regular files and directories can be restricted by Landlock,
according to the handled accesses of a ruleset.  However, files that do not
come from a user-visible filesystem (e.g. pipe, socket), but can still be
accessed through ``/proc/<pid>/fd/*``, cannot currently be explicitly
restricted.  Likewise, some special kernel filesystems such as nsfs, which can
be accessed through ``/proc/<pid>/ns/*``, cannot currently be explicitly
restricted.  However, thanks to the `ptrace restrictions`_, access to such
sensitive ``/proc`` files is automatically restricted according to domain
hierarchies.  Future Landlock evolutions could still make it possible to
explicitly restrict such paths with dedicated ruleset flags.

Ruleset layers
--------------

There is a limit of 64 layers of stacked rulesets.  This can be an issue for a
task that wants to enforce a new ruleset in addition to its 64 inherited
rulesets.  Once this limit is reached, sys_landlock_restrict_self() returns
E2BIG.  It is therefore strongly suggested to carefully build rulesets once in
the life of a thread, especially for applications able to launch other
applications that may also want to sandbox themselves (e.g. shells, container
managers).

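An application that may itself be started from an already deeply sandboxed
parent can tell this case apart from other failures.  The following is a
minimal sketch: ``try_restrict_self()`` is a hypothetical helper, it assumes
that ``__NR_landlock_restrict_self`` is defined by the installed kernel headers,
and it expects the caller to have already set ``PR_SET_NO_NEW_PRIVS``.

.. code-block:: c

    #define _GNU_SOURCE
    #include <errno.h>
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /*
     * Hypothetical helper: enforces the ruleset on the calling thread and
     * reports a full layer stack separately from other errors.
     * Returns 0 on success, 1 if the layer limit is reached, -1 otherwise.
     */
    static int try_restrict_self(const int ruleset_fd)
    {
        if (syscall(__NR_landlock_restrict_self, ruleset_fd, 0) == 0)
            return 0;
        if (errno == E2BIG) {
            /* The previously enforced layers still apply. */
            fprintf(stderr, "Layer limit reached, keeping inherited policy\n");
            return 1;
        }
        perror("Failed to enforce ruleset");
        return -1;
    }

Whether to continue without the extra layer or to abort when the helper
returns 1 is a policy decision left to the application.
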
Memory usage
------------

Kernel memory allocated to create rulesets is accounted and can be restricted
by the :doc:`/admin-guide/cgroup-v1/memory`.

Questions and answers
=====================

What about user space sandbox managers?
---------------------------------------

Using a user space process to enforce restrictions on kernel resources can lead
to race conditions or inconsistent evaluations (i.e. `Incorrect mirroring of
the OS code and state
<https://www.ndss-symposium.org/ndss2003/traps-and-pitfalls-practical-problems-system-call-interposition-based-security-tools/>`_).

What about namespaces and containers?
-------------------------------------

Namespaces can help create sandboxes but they are not designed for access
control and therefore lack useful features for such a use case (e.g. no
fine-grained restrictions).  Moreover, their complexity can lead to security
issues, especially when untrusted processes can manipulate them (cf.
`Controlling access to user namespaces <https://lwn.net/Articles/673597/>`_).

Additional documentation
========================

* :doc:`/security/landlock`
* https://landlock.io

.. Links
.. _samples/landlock/sandboxer.c:
   https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/samples/landlock/sandboxer.c