=========
SafeSetID
=========
SafeSetID is an LSM module that gates the setid family of syscalls to restrict
UID/GID transitions from a given UID/GID to only those approved by a
system-wide allowlist. These restrictions also prohibit the given UIDs/GIDs
from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as
allowing a user to set up user namespace UID/GID mappings.


Background
==========
In the absence of file capabilities, processes spawned on a Linux system that
need to switch to a different user must be spawned with CAP_SETUID privileges.
CAP_SETUID is granted to programs running as root or those running as a non-root
user that have been explicitly given the CAP_SETUID runtime capability. It is
often preferable to use Linux runtime capabilities rather than file
capabilities, because running a program with elevated privileges through file
capabilities opens up possible security holes: any user with access to the
file can exec() that program to gain the elevated privileges.

While it is possible to implement a tree of processes by giving full
CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a
tree of processes under non-root user(s) in the first place. Specifically,
since CAP_SETUID allows changing to any user on the system, including the root
user, it is an overpowered capability for what is needed in this scenario,
especially since programs often only call setuid() to drop privileges to a
lesser-privileged user -- not elevate privileges. Unfortunately, there is no
generally feasible way in Linux to restrict the potential UIDs that a user can
switch to through setuid() beyond allowing a switch to any user on the system.
This SafeSetID LSM seeks to provide a solution for restricting setid
capabilities in such a way.

The main use case for this LSM is to allow a non-root program to transition to
other untrusted uids without full-blown CAP_SETUID capabilities.
The non-root program would still need CAP_SETUID to do any kind of transition,
but the additional restrictions imposed by this LSM would mean it is a "safer"
version of CAP_SETUID since the non-root program cannot take advantage of
CAP_SETUID to do any unapproved actions (e.g. setuid to uid 0 or create/enter
new user namespace). The higher level goal is to allow for uid-based sandboxing
of system services without having to give out CAP_SETUID all over the place just
so that non-root programs can drop to even-lesser-privileged uids. This is
especially relevant when one non-root daemon on the system should be allowed to
spawn other processes as different uids, but it's undesirable to give the daemon
a basically-root-equivalent CAP_SETUID.


Other Approaches Considered
===========================

Solve this problem in userspace
-------------------------------
For candidate applications that would like to have restricted setid capabilities
as implemented in this LSM, an alternative option would be to simply take away
setid capabilities from the application completely and refactor the process
spawning semantics in the application (e.g. by using a privileged helper program
to do process spawning and UID/GID transitions). Unfortunately, there are a
number of semantics around process spawning that would be affected by this, such
as fork() calls where the program doesn't immediately call exec() after the
fork(), parent processes specifying custom environment variables or command line
args for spawned child processes, or inheritance of file handles across a
fork()/exec(). Because of this, a solution that uses a privileged helper in
userspace would likely be less appealing to incorporate into existing projects
that rely on certain process-spawning semantics in Linux.

Use user namespaces
-------------------
Another possible approach would be to run a given process tree in its own user
namespace and give programs in the tree setid capabilities. In this way,
programs in the tree could change to any desired UID/GID in the context of their
own user namespace, and only approved UIDs/GIDs could be mapped back to the
initial system user namespace, effectively preventing privilege escalation.
Unfortunately, it is not generally feasible to use user namespaces in isolation,
without pairing them with other namespace types, which is not always an option.
Linux checks for capabilities based on the user namespace that "owns" some
entity. For example, Linux has the notion that network namespaces are owned by
the user namespace in which they were created. A consequence of this is that
capability checks for access to a given network namespace are done by checking
whether a task has the given capability in the context of the user namespace
that owns the network namespace -- not necessarily the user namespace under
which the given task runs. Therefore, spawning a process in a new user namespace
effectively prevents it from accessing the network namespace owned by the
initial namespace. This is a deal-breaker for any application that expects to
retain the CAP_NET_ADMIN capability for the purpose of adjusting network
configurations. Using user namespaces in isolation causes problems regarding
other system interactions, including use of pid namespaces and device creation.

Use an existing LSM
-------------------
None of the other in-tree LSMs have the capability to gate setid transitions, or
even employ the security_task_fix_setuid hook at all. SELinux says of that hook:
"Since setuid only affects the current process, and since the SELinux controls
are not based on the Linux identity attributes, SELinux does not need to control
this operation."


Directions for use
==================
This LSM hooks the setid syscalls to make sure transitions are allowed if an
applicable restriction policy is in place. Policies are configured through
securityfs by writing to the safesetid/uid_allowlist_policy and
safesetid/gid_allowlist_policy files at the location where securityfs is
mounted. The format for adding a policy is '<UID>:<UID>' or '<GID>:<GID>',
using literal numbers, and ending with a newline character such as '123:456\n'.
Writing an empty string "" will flush the policy. Again, configuring a policy
for a UID/GID will prevent that UID/GID from obtaining auxiliary setid
privileges, such as allowing a user to set up user namespace UID/GID mappings.

Note on GID policies and setgroups()
====================================
In v5.9 we are adding support for limiting CAP_SETGID privileges as was done
previously for CAP_SETUID. However, for compatibility with common sandboxing
related code conventions in userspace, we currently allow arbitrary
setgroups() calls for processes with CAP_SETGID restrictions. Until we add
support in a future release for restricting setgroups() calls, these GID
policies add no meaningful security. setgroups() restrictions will be enforced
once we have the policy checking code in place, which will rely on GID policy
configuration code added in v5.9.
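
As an illustration of the policy format described under "Directions for use",
the following is a minimal Python sketch that builds '<UID>:<UID>' lines and
writes them to a policy file. The helper names (format_rule, build_policy,
write_policy) are hypothetical and not part of any kernel interface. The
/sys/kernel/security path is only the conventional securityfs mount point and
is an assumption, as is the choice to hand over all rules in a single write.

```python
import os

# Assumption: securityfs is mounted at its conventional location. Adjust this
# if it is mounted elsewhere on the target system.
SECURITYFS_MOUNT = "/sys/kernel/security"
UID_POLICY_FILE = os.path.join(SECURITYFS_MOUNT, "safesetid", "uid_allowlist_policy")
GID_POLICY_FILE = os.path.join(SECURITYFS_MOUNT, "safesetid", "gid_allowlist_policy")


def format_rule(src_id, dst_id):
    """Return one policy line, e.g. '123:456' plus a newline, for src -> dst."""
    for value in (src_id, dst_id):
        # The format takes literal numbers, so reject names, bools and negatives.
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"ID must be a non-negative integer, got {value!r}")
    return f"{src_id}:{dst_id}\n"


def build_policy(rules):
    """Join (src, dst) pairs into newline-terminated policy text."""
    return "".join(format_rule(src, dst) for src, dst in rules)


def write_policy(path, rules):
    """Write the rules to an existing policy file (hypothetical helper)."""
    data = build_policy(rules).encode("ascii")
    if not data:
        # Per the text above, writing an empty string flushes the policy.
        # Refuse here so that a flush never happens by accident.
        raise ValueError("no rules given")
    # Assumption: all rules are passed to the kernel in one write() call.
    fd = os.open(path, os.O_WRONLY)
    try:
        written = os.write(fd, data)
    finally:
        os.close(fd)
    if written != len(data):
        raise OSError(f"short write to {path}: {written} of {len(data)} bytes")
```

On a system with SafeSetID enabled, a suitably privileged process could then
call write_policy(UID_POLICY_FILE, [(123, 456)]) to allow UID 123 to
transition to UID 456. The same helpers apply to GID_POLICY_FILE, subject to
the setgroups() caveat above.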