====================
Credentials in Linux
====================

By: David Howells <dhowells@redhat.com>

.. contents:: :local:

Overview
========

There are several parts to the security check performed by Linux when one
object acts upon another:

 1. Objects.

     Objects are things in the system that may be acted upon directly by
     userspace programs.  Linux has a variety of actionable objects, including:

        - Tasks
        - Files/inodes
        - Sockets
        - Message queues
        - Shared memory segments
        - Semaphores
        - Keys

     As a part of the description of all these objects there is a set of
     credentials.  What's in the set depends on the type of object.

 2. Object ownership.

     Amongst the credentials of most objects, there will be a subset that
     indicates the ownership of that object.  This is used for resource
     accounting and limitation (disk quotas and task rlimits for example).

     In a standard UNIX filesystem, for instance, this will be defined by the
     UID marked on the inode.

 3. The objective context.

     Also amongst the credentials of those objects, there will be a subset that
     indicates the 'objective context' of that object.
     This may or may not be the same set as in (2) - in standard UNIX files,
     for instance, this is defined by the UID and the GID marked on the inode.

     The objective context is used as part of the security calculation that is
     carried out when an object is acted upon.

 4. Subjects.

     A subject is an object that is acting upon another object.

     Most of the objects in the system are inactive: they don't act on other
     objects within the system.  Processes/tasks are the obvious exception:
     they do stuff; they access and manipulate things.

     Objects other than tasks may under some circumstances also be subjects.
     For instance an open file may send SIGIO to a task using the UID and EUID
     given to it by a task that called ``fcntl(F_SETOWN)`` upon it.  In this
     case, the file struct will have a subjective context too.

 5. The subjective context.

     A subject has an additional interpretation of its credentials.  A subset
     of its credentials forms the 'subjective context'.  The subjective context
     is used as part of the security calculation that is carried out when a
     subject acts.

     A Linux task, for example, has the FSUID, FSGID and the supplementary
     group list for when it is acting upon a file - which are quite separate
     from the real UID and GID that normally form the objective context of the
     task.

 6. Actions.

     Linux has a number of actions available that a subject may perform upon an
     object.  The set of actions available depends on the nature of the subject
     and the object.

     Actions include reading, writing, creating and deleting files; forking or
     signalling and tracing tasks.

 7. Rules, access control lists and security calculations.

     When a subject acts upon an object, a security calculation is made.  This
     involves taking the subjective context, the objective context and the
     action, and searching one or more sets of rules to see whether the subject
     is granted or denied permission to act in the desired manner on the
     object, given those contexts.

     There are two main sources of rules:

     a. Discretionary access control (DAC):

         Sometimes the object will include sets of rules as part of its
         description.  This is an 'Access Control List' or 'ACL'.  A Linux
         file may supply more than one ACL.

         A traditional UNIX file, for example, includes a permissions mask that
         is an abbreviated ACL with three fixed classes of subject ('user',
         'group' and 'other'), each of which may be granted certain privileges
         ('read', 'write' and 'execute' - whatever those map to for the object
         in question).  UNIX file permissions do not allow the arbitrary
         specification of subjects, however, and so are of limited use.

         A Linux file might also sport a POSIX ACL.  This is a list of rules
         that grants various permissions to arbitrary subjects.

     b. Mandatory access control (MAC):

         The system as a whole may have one or more sets of rules that get
         applied to all subjects and objects, regardless of their source.
         SELinux and Smack are examples of this.

         In the case of SELinux and Smack, each object is given a label as part
         of its credentials.  When an action is requested, they take the
         subject label, the object label and the action and look for a rule
         that says that this action is either granted or denied.


Types of Credentials
====================

The Linux kernel supports the following types of credentials:

 1. Traditional UNIX credentials.

        - Real User ID
        - Real Group ID

     The UID and GID are carried by most, if not all, Linux objects, even if in
     some cases they have to be invented (FAT or CIFS files for example, which
     are derived from Windows).  These (mostly) define the objective context of
     that object, with tasks being slightly different in some cases.

        - Effective, Saved and FS User ID
        - Effective, Saved and FS Group ID
        - Supplementary groups

     These are additional credentials used by tasks only.
     Usually, an EUID/EGID/GROUPS will be used as the subjective context, and
     real UID/GID will be used as the objective.  For tasks, it should be noted
     that this is not always true.

 2. Capabilities.

        - Set of permitted capabilities
        - Set of inheritable capabilities
        - Set of effective capabilities
        - Capability bounding set

     These are only carried by tasks.  They indicate superior capabilities
     granted piecemeal to a task that an ordinary task wouldn't otherwise have.
     These are manipulated implicitly by changes to the traditional UNIX
     credentials, but can also be manipulated directly by the ``capset()``
     system call.

     The permitted capabilities are those caps that the process might grant
     itself to its effective or permitted sets through ``capset()``.  The
     inheritable set might also be so constrained.

     The effective capabilities are the ones that a task is actually allowed to
     make use of itself.

     The inheritable capabilities are the ones that may get passed across
     ``execve()``.

     The bounding set limits the capabilities that may be inherited across
     ``execve()``, especially when a binary is executed that will execute as
     UID 0.

 3. Secure management flags (securebits).

     These are only carried by tasks.
     These govern the way the above credentials are manipulated and inherited
     over certain operations such as ``execve()``.  They aren't used directly
     as objective or subjective credentials.

 4. Keys and keyrings.

     These are only carried by tasks.  They carry and cache security tokens
     that don't fit into the other standard UNIX credentials.  They are for
     making such things as network filesystem keys available to the file
     accesses performed by processes, without the necessity of ordinary
     programs having to know about security details involved.

     Keyrings are a special type of key.  They carry sets of other keys and can
     be searched for the desired key.  Each process may subscribe to a number
     of keyrings:

        - Per-thread keyring
        - Per-process keyring
        - Per-session keyring

     When a process accesses a key, if not already present, it will normally be
     cached on one of these keyrings for future accesses to find.

     For more information on using keys, see ``Documentation/security/keys/*``.

 5. LSM

     The Linux Security Module allows extra controls to be placed over the
     operations that a task may do.  Currently Linux supports several LSM
     options.

     Some work by labelling the objects in a system and then applying sets of
     rules (policies) that say what operations a task with one label may do to
     an object with another label.

 6. AF_KEY

     This is a socket-based approach to credential management for networking
     stacks [RFC 2367].  It isn't discussed by this document as it doesn't
     interact directly with task and file credentials; rather it keeps system
     level credentials.


When a file is opened, part of the opening task's subjective context is
recorded in the file struct created.  This allows operations using that file
struct to use those credentials instead of the subjective context of the task
that issued the operation.  An example of this would be a file opened on a
network filesystem where the credentials of the opened file should be presented
to the server, regardless of who is actually doing a read or a write upon it.


File Markings
=============

Files on disk or obtained over the network may have annotations that form the
objective security context of that file.
Depending on the type of filesystem, this may include one or more of the
following:

 * UNIX UID, GID, mode;
 * Windows user ID;
 * Access control list;
 * LSM security label;
 * UNIX exec privilege escalation bits (SUID/SGID);
 * File capabilities exec privilege escalation bits.

These are compared to the task's subjective security context, and certain
operations allowed or disallowed as a result.  In the case of ``execve()``, the
privilege escalation bits come into play, and may allow the resulting process
extra privileges, based on the annotations on the executable file.


Task Credentials
================

In Linux, all of a task's credentials are held either directly (uid, gid) or
by reference (groups, keys, LSM security) in a refcounted structure of type
'struct cred'.  Each task points to its credentials by a pointer called 'cred'
in its task_struct.

Once a set of credentials has been prepared and committed, it may not be
changed, barring the following exceptions:

 1. its reference count may be changed;

 2. the reference count on the group_info struct it points to may be changed;

 3. the reference count on the security data it points to may be changed;

 4. the reference count on any keyrings it points to may be changed;

 5. any keyrings it points to may be revoked, expired or have their security
    attributes changed; and

 6. the contents of any keyrings to which it points may be changed (the whole
    point of keyrings being a shared set of credentials, modifiable by anyone
    with appropriate access).

To alter anything in the cred struct, the copy-and-replace principle must be
adhered to.  First take a copy, then alter the copy and then use RCU to change
the task pointer to make it point to the new copy.  There are wrappers to aid
with this (see below).

A task may only alter its _own_ credentials; it is no longer permitted for a
task to alter another's credentials.  This means the ``capset()`` system call
is no longer permitted to take any PID other than the one of the current
process.  Also, the ``keyctl_instantiate()`` and ``keyctl_negate()`` functions
no longer permit attachment to process-specific keyrings in the requesting
process as the instantiating process may need to create them.


Immutable Credentials
---------------------

Once a set of credentials has been made public (by calling ``commit_creds()``
for example), it must be considered immutable, barring two exceptions:

 1. The reference count may be altered.

 2. While the keyring subscriptions of a set of credentials may not be
    changed, the keyrings subscribed to may have their contents altered.

To catch accidental credential alteration at compile time, struct task_struct
has _const_ pointers to its credential sets, as does struct file.  Furthermore,
certain functions such as ``get_cred()`` and ``put_cred()`` operate on const
pointers, so callers need no casts; internally these functions temporarily
discard the const qualification in order to alter the reference count.


Accessing Task Credentials
--------------------------

A task being able to alter only its own credentials permits the current process
to read or replace its own credentials without the need for any form of locking
-- which simplifies things greatly.  It can just call::

    const struct cred *current_cred()

to get a pointer to its credentials structure, and it doesn't have to release
it afterwards.

There are convenience wrappers for retrieving specific aspects of a task's
credentials (the value is simply returned in each case)::

    uid_t current_uid(void)                 Current's real UID
    gid_t current_gid(void)                 Current's real GID
    uid_t current_euid(void)                Current's effective UID
    gid_t current_egid(void)                Current's effective GID
    uid_t current_fsuid(void)               Current's file access UID
    gid_t current_fsgid(void)               Current's file access GID
    kernel_cap_t current_cap(void)          Current's effective capabilities
    struct user_struct *current_user(void)  Current's user account

There are also convenience wrappers for retrieving specific associated pairs of
a task's credentials::

    void current_uid_gid(uid_t *, gid_t *);
    void current_euid_egid(uid_t *, gid_t *);
    void current_fsuid_fsgid(uid_t *, gid_t *);

which return these pairs of values through their arguments after retrieving
them from the current task's credentials.
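
As a rough illustration of how the pair wrappers behave, the following is a
userspace sketch only, not kernel code.  Every ``model_`` name is hypothetical
and ``struct model_cred`` is a drastically reduced stand-in for ``struct cred``.
It shows the idea of fetching the credentials pointer once and reading both
members of a pair from that single set:

```c
#include <assert.h>

/* Hypothetical, reduced stand-in for struct cred (not the kernel layout). */
struct model_cred {
	unsigned int uid, gid;		/* real IDs */
	unsigned int euid, egid;	/* effective IDs */
};

/* Stand-in for the credentials that "current" points to. */
static struct model_cred model_task_cred = { 1000, 100, 0, 0 };

/* Models current_cred(): no locking, and nothing to release afterwards. */
static const struct model_cred *model_current_cred(void)
{
	return &model_task_cred;
}

/* Models current_uid_gid(): both values come from the same cred set. */
static void model_current_uid_gid(unsigned int *uid, unsigned int *gid)
{
	const struct model_cred *cred = model_current_cred();

	assert(uid && gid);
	*uid = cred->uid;
	*gid = cred->gid;
}

/* Models current_euid_egid() in the same way. */
static void model_current_euid_egid(unsigned int *euid, unsigned int *egid)
{
	const struct model_cred *cred = model_current_cred();

	assert(euid && egid);
	*euid = cred->euid;
	*egid = cred->egid;
}
```

Reading both members through one pointer means the pair always describes a
single committed set of credentials.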


In addition, there is a function for obtaining a reference on the current
process's current set of credentials::

    const struct cred *get_current_cred(void);

and functions for getting references to one of the credentials that don't
actually live in struct cred::

    struct user_struct *get_current_user(void);
    struct group_info *get_current_groups(void);

which get references to the current process's user accounting structure and
supplementary groups list respectively.

Once a reference has been obtained, it must be released with ``put_cred()``,
``free_uid()`` or ``put_group_info()`` as appropriate.


Accessing Another Task's Credentials
------------------------------------

While a task may access its own credentials without the need for locking, the
same is not true of a task wanting to access another task's credentials.  It
must use the RCU read lock and ``rcu_dereference()``.

The ``rcu_dereference()`` is wrapped by::

    const struct cred *__task_cred(struct task_struct *task);

This should be used inside the RCU read lock, as in the following example::

    void foo(struct task_struct *t, struct foo_data *f)
    {
        const struct cred *tcred;
        ...
        rcu_read_lock();
        tcred = __task_cred(t);
        f->uid = tcred->uid;
        f->gid = tcred->gid;
        f->groups = get_group_info(tcred->groups);
        rcu_read_unlock();
        ...
    }

Should it be necessary to hold another task's credentials for a long period of
time, and possibly to sleep while doing so, then the caller should get a
reference on them using::

    const struct cred *get_task_cred(struct task_struct *task);

This does all the RCU magic inside of it.  The caller must call ``put_cred()``
on the credentials so obtained when they're finished with.

.. note::
   The result of ``__task_cred()`` should not be passed directly to
   ``get_cred()`` as this may race with ``commit_creds()``.

There are a couple of convenience functions to access bits of another task's
credentials, hiding the RCU magic from the caller::

    uid_t task_uid(task)        Task's real UID
    uid_t task_euid(task)       Task's effective UID

If the caller is holding the RCU read lock at the time anyway, then::

    __task_cred(task)->uid
    __task_cred(task)->euid

should be used instead.
Similarly, if multiple aspects of a task's credentials need to be accessed,
the RCU read lock should be taken, ``__task_cred()`` called, the result stored
in a temporary pointer and then the credential aspects read from that before
dropping the lock.  This prevents the potentially expensive RCU magic from
being invoked multiple times.

Should some other single aspect of another task's credentials need to be
accessed, then this can be used::

    task_cred_xxx(task, member)

where 'member' is a non-pointer member of the cred struct.  For instance::

    uid_t task_cred_xxx(task, suid);

will retrieve 'struct cred::suid' from the task, doing the appropriate RCU
magic.  This may not be used for pointer members as what they point to may
disappear the moment the RCU read lock is dropped.


Altering Credentials
--------------------

As previously mentioned, a task may only alter its own credentials, and may not
alter those of another task.  This means that it doesn't need to use any
locking to alter its own credentials.

To alter the current process's credentials, a function should first prepare a
new set of credentials by calling::

    struct cred *prepare_creds(void);

This locks ``current->cred_replace_mutex`` and then allocates and constructs a
duplicate of the current process's credentials, returning with the mutex still
held if successful.  It returns NULL if not successful (out of memory).

The mutex prevents ``ptrace()`` from altering the ptrace state of a process
while security checks on credentials construction and changing are taking place
as the ptrace state may alter the outcome, particularly in the case of
``execve()``.

The new credentials set should be altered appropriately, and any security
checks and hooks done.  Both the current and the proposed sets of credentials
are available for this purpose as ``current_cred()`` will return the current
set still at this point.

When replacing the group list, the new list must be sorted before it
is added to the credential, as a binary search is used to test for
membership.  In practice, this means ``groups_sort()`` should be
called before ``set_groups()`` or ``set_current_groups()``.
``groups_sort()`` must not be called on a ``struct group_info`` which
is shared as it may permute elements as part of the sorting process
even if the array is already sorted.
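
To see why the list has to be sorted first, here is a small userspace sketch
(not the kernel implementation).  ``struct model_groups`` and the ``model_``
functions are hypothetical names invented for this illustration; the standard
C library's ``qsort()`` and ``bsearch()`` stand in for the kernel's sorting
and lookup code:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a supplementary group list. */
struct model_groups {
	size_t ngroups;
	unsigned int gid[8];
};

/* Three-way comparison of two group IDs, as qsort()/bsearch() expect. */
static int model_gid_cmp(const void *a, const void *b)
{
	unsigned int x = *(const unsigned int *)a;
	unsigned int y = *(const unsigned int *)b;

	return (x > y) - (x < y);
}

/* Sort a private (unshared) list in place before it is installed. */
static void model_groups_sort(struct model_groups *g)
{
	assert(g != NULL);
	qsort(g->gid, g->ngroups, sizeof(g->gid[0]), model_gid_cmp);
}

/* Membership test by binary search; only valid on a sorted list. */
static int model_in_group(const struct model_groups *g, unsigned int gid)
{
	return bsearch(&gid, g->gid, g->ngroups, sizeof(g->gid[0]),
		       model_gid_cmp) != NULL;
}
```

Because the sort rearranges the array in place, running it on a list that
other users are searching concurrently could make their lookups miss entries,
which is the reason given above for never sorting a shared list.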

When the credential set is ready, it should be committed to the current process
by calling::

    int commit_creds(struct cred *new);

This will alter various aspects of the credentials and the process, giving the
LSM a chance to do likewise.  It will then use ``rcu_assign_pointer()`` to
actually commit the new credentials to ``current->cred``, release
``current->cred_replace_mutex`` to allow ``ptrace()`` to take place, and
notify the scheduler and others of the changes.

This function is guaranteed to return 0, so that it can be tail-called at the
end of such functions as ``sys_setresuid()``.

Note that this function consumes the caller's reference to the new credentials.
The caller should _not_ call ``put_cred()`` on the new credentials afterwards.

Furthermore, once this function has been called on a new set of credentials,
those credentials may _not_ be changed further.


Should the security checks fail or some other error occur after
``prepare_creds()`` has been called, then the following function should be
invoked::

    void abort_creds(struct cred *new);

This releases the lock on ``current->cred_replace_mutex`` that
``prepare_creds()`` got and then releases the new credentials.
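
The prepare/commit/abort sequence can be modelled in userspace as a plain
copy-and-replace of a pointer.  The sketch below is an assumption-laden
illustration, not the kernel code: all ``model_`` names are hypothetical, the
mutex, LSM hooks and RCU are left out entirely, and a simple assignment stands
in for ``rcu_assign_pointer()``:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, much-reduced stand-in for struct cred. */
struct model_cred {
	int usage;		/* reference count */
	unsigned int suid;	/* one example field */
};

/* Stand-in for current->cred: points at a committed, immutable set. */
static const struct model_cred *model_current;

/* Copy the committed set; the copy is private and may be modified. */
static struct model_cred *model_prepare_creds(void)
{
	struct model_cred *new = malloc(sizeof(*new));

	if (!new)
		return NULL;
	*new = *model_current;
	new->usage = 1;
	return new;
}

/* Publish the new set, consuming the caller's reference to it. */
static int model_commit_creds(struct model_cred *new)
{
	struct model_cred *old = (struct model_cred *)model_current;

	model_current = new;	/* plain store in place of rcu_assign_pointer() */
	if (--old->usage == 0)
		free(old);
	return 0;
}

/* Discard a prepared set that was never published. */
static void model_abort_creds(struct model_cred *new)
{
	assert(new->usage == 1);
	free(new);
}
```

In this model, as in the text above, the old set stays visible and unmodified
until the commit, and a set that has been committed is never written to again.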


A typical credentials alteration function would look something like this::

    int alter_suid(uid_t suid)
    {
        struct cred *new;
        int ret;

        new = prepare_creds();
        if (!new)
            return -ENOMEM;

        new->suid = suid;
        ret = security_alter_suid(new);
        if (ret < 0) {
            abort_creds(new);
            return ret;
        }

        return commit_creds(new);
    }


Managing Credentials
--------------------

There are some functions to help manage credentials:

 - ``void put_cred(const struct cred *cred);``

     This releases a reference to the given set of credentials.  If the
     reference count reaches zero, the credentials will be scheduled for
     destruction by the RCU system.

 - ``const struct cred *get_cred(const struct cred *cred);``

     This gets a reference on a live set of credentials, returning a pointer to
     that set of credentials.

 - ``struct cred *get_new_cred(struct cred *cred);``

     This gets a reference on a set of credentials that is under construction
     and is thus still mutable, returning a pointer to that set of credentials.
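
The reference counting behind these helpers can be sketched in userspace as
follows.  This is only a model under stated assumptions: the ``model_`` names
are hypothetical, the count is a plain ``int`` rather than an atomic, and the
structure is freed immediately instead of being handed to RCU for deferred
destruction:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for struct cred. */
struct model_cred {
	int usage;		/* reference count */
	unsigned int uid;
};

static int model_freed;		/* counts destructions, for demonstration only */

/* Take a reference on a live (const) set and hand the same pointer back. */
static const struct model_cred *model_get_cred(const struct model_cred *cred)
{
	/* The count is bookkeeping rather than content, so const is cast away. */
	((struct model_cred *)cred)->usage++;
	return cred;
}

/* Drop a reference; the last put destroys the set. */
static void model_put_cred(const struct model_cred *cred)
{
	struct model_cred *c = (struct model_cred *)cred;

	assert(c->usage > 0);
	if (--c->usage == 0) {
		model_freed++;
		free(c);
	}
}
```

Both helpers accept const pointers so that callers holding a
``const struct model_cred *`` never need a cast of their own.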


Open File Credentials
=====================

When a new file is opened, a reference is obtained on the opening task's
credentials and this is attached to the file struct as ``f_cred`` in place of
``f_uid`` and ``f_gid``.  Code that used to access ``file->f_uid`` and
``file->f_gid`` should now access ``file->f_cred->fsuid`` and
``file->f_cred->fsgid``.

It is safe to access ``f_cred`` without the use of RCU or locking because the
pointer will not change over the lifetime of the file struct, and nor will the
contents of the cred struct pointed to, barring the exceptions listed above
(see the Task Credentials section).

To avoid "confused deputy" privilege escalation attacks, access control checks
during subsequent operations on an opened file should use these credentials
instead of "current"'s credentials, as the file may have been passed to a more
privileged process.

Overriding the VFS's Use of Credentials
=======================================

Under some circumstances it is desirable to override the credentials used by
the VFS, and that can be done by calling functions such as ``vfs_mkdir()`` with
a different set of credentials.  This is done in the following places:

 * ``sys_faccessat()``.
 * ``do_coredump()``.
 * nfs4recover.c.
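
The general shape of such an override is "swap in a different set, do the
operation, swap the original back".  The userspace sketch below illustrates
only that shape.  The ``model_`` names are hypothetical (they loosely mirror
the kernel's ``override_creds()`` and ``revert_creds()`` helpers, whose exact
reference-counting rules are not modelled here), and the "fsuid 0 may create"
rule is an arbitrary stand-in for a real permission check:

```c
#include <assert.h>

/* Hypothetical, reduced stand-in for struct cred. */
struct model_cred {
	unsigned int fsuid;
};

/* The set of credentials the modelled "VFS" consults. */
static const struct model_cred *model_active;

/* Install a temporary set and return the one it replaced. */
static const struct model_cred *model_override_creds(const struct model_cred *new)
{
	const struct model_cred *old = model_active;

	assert(new != 0);
	model_active = new;
	return old;
}

/* Put back the set saved by model_override_creds(). */
static void model_revert_creds(const struct model_cred *old)
{
	model_active = old;
}

/* Stand-in for a VFS operation that checks the active credentials. */
static int model_may_create(void)
{
	return model_active->fsuid == 0;
}
```

The caller keeps the returned pointer for the duration of the operation and
must restore it on every exit path, including error paths.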