.. SPDX-License-Identifier: GPL-2.0

===================================
Cache on Already Mounted Filesystem
===================================

.. Contents:

 (*) Overview.

 (*) Requirements.

 (*) Configuration.

 (*) Starting the cache.

 (*) Things to avoid.

 (*) Cache culling.

 (*) Cache structure.

 (*) Security model and SELinux.

 (*) A note on security.

 (*) Statistical information.

 (*) Debugging.

 (*) On-demand Read.


Overview
========

CacheFiles is a caching backend that uses a directory on an already mounted
filesystem of a local type (such as Ext3) as its cache.

CacheFiles uses a userspace daemon to do some of the cache management - such as
reaping stale nodes and culling.
This is called cachefilesd and lives in
/sbin.

The filesystem and data integrity of the cache are only as good as those of the
filesystem providing the backing services.  Note that CacheFiles does not
attempt to journal anything since the journalling interfaces of the various
filesystems are very specific in nature.

CacheFiles creates a misc character device - "/dev/cachefiles" - that is used
to communicate with the daemon.  The device may only be open once at any given
time, and while it is open, a cache is at least partially in existence.  The
daemon opens this and sends commands down it to control the cache.

CacheFiles is currently limited to a single cache.

CacheFiles attempts to maintain at least a certain percentage of free space on
the filesystem, shrinking the cache by culling the objects it contains to make
space if necessary - see the "Cache Culling" section.  This means it can be
placed on the same medium as a live set of data, and will expand to make use of
spare space and automatically contract when the set of data requires more
space.

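The free-space behaviour described above can be sketched as a small state
function.  This is an illustration only, not the code used by the module or
the daemon: the names ``Limits`` and ``next_state`` are hypothetical, while the
run/cull/stop limits, their defaults and their ordering rule are the ones given
in the "Configuration" and "Cache Culling" sections.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Percentage limits for one resource (blocks or files). Hypothetical type."""
    run: float = 7.0    # documented default for brun/frun
    cull: float = 5.0   # documented default for bcull/fcull
    stop: float = 1.0   # documented default for bstop/fstop

    def __post_init__(self):
        # The documentation requires 0 <= stop < cull < run < 100.
        if not (0 <= self.stop < self.cull < self.run < 100):
            raise ValueError("limits must satisfy 0 <= stop < cull < run < 100")


def next_state(culling, blocks_avail, files_avail,
               blocks=Limits(), files=Limits()):
    """Return (culling, allocation_allowed) for the given availability.

    blocks_avail and files_avail are percentages of available space and
    available files on the backing filesystem.
    """
    if blocks_avail < blocks.cull or files_avail < files.cull:
        # Either resource fell below its cull limit: start culling.
        culling = True
    elif blocks_avail > blocks.run and files_avail > files.run:
        # Both resources rose above their run limits: stop culling.
        culling = False
    # Between the cull and run limits the previous state is kept.

    # Below either stop limit no further allocation is permitted.
    allocation_allowed = not (blocks_avail < blocks.stop
                              or files_avail < files.stop)
    return culling, allocation_allowed
```

With the default limits, ``next_state(False, 4, 10)`` switches culling on, and
culling then stays on at 6% available blocks because the run limit of 7% has
not yet been exceeded.  The gap between the cull and run limits is what keeps
culling from switching on and off repeatedly.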


Requirements
============

The use of CacheFiles and its daemon requires the following features to be
available in the system and in the cache filesystem:

        - dnotify.

        - extended attributes (xattrs).

        - openat() and friends.

        - bmap() support on files in the filesystem (FIBMAP ioctl).

        - The use of bmap() to detect a partial page at the end of the file.

It is strongly recommended that the "dir_index" option is enabled on Ext3
filesystems being used as a cache.


Configuration
=============

The cache is configured by a script in /etc/cachefilesd.conf.  These commands
set up the cache ready for use.  The following script commands are available:

 brun <N>%, bcull <N>%, bstop <N>%, frun <N>%, fcull <N>%, fstop <N>%
        Configure the culling limits.  Optional.  See the section on culling.
        The defaults are 7% (run), 5% (cull) and 1% (stop) respectively.

        The commands beginning with a 'b' are file space (block) limits, those
        beginning with an 'f' are file count limits.

 dir <path>
        Specify the directory containing the root of the cache.  Mandatory.

 tag <name>
        Specify a tag to FS-Cache to use in distinguishing multiple caches.
        Optional.  The default is "CacheFiles".

 debug <mask>
        Specify a numeric bitmask to control debugging in the kernel module.
        Optional.  The default is zero (all off).  The following values can be
        OR'd into the mask to collect various information:

                ==      =================================================
                1       Turn on trace of function entry (_enter() macros)
                2       Turn on trace of function exit (_leave() macros)
                4       Turn on trace of internal debug points (_debug())
                ==      =================================================

        This mask can also be set through sysfs, e.g.::

                echo 5 >/sys/module/cachefiles/parameters/debug


Starting the Cache
==================

The cache is started by running the daemon.
The daemon opens the cache device,
configures the cache and tells it to begin caching.  At that point the cache
binds to fscache and the cache becomes live.

The daemon is run as follows::

        /sbin/cachefilesd [-d]* [-s] [-n] [-f <configfile>]

The flags are:

 ``-d``
        Increase the debugging level.  This can be specified multiple times and
        is cumulative with itself.

 ``-s``
        Send messages to stderr instead of syslog.

 ``-n``
        Don't daemonise and go into background.

 ``-f <configfile>``
        Use an alternative configuration file rather than the default one.


Things to Avoid
===============

Do not mount other things within the cache as this will cause problems.  The
kernel module contains its own very cut-down path walking facility that ignores
mountpoints, but the daemon can't avoid them.

Do not create, rename or unlink files and directories in the cache while the
cache is active, as this may cause the state to become uncertain.

Renaming files in the cache might make objects appear to be other objects (the
filename is part of the lookup key).

Do not change or remove the extended attributes attached to cache files by the
cache as this will cause the cache state management to get confused.

Do not create files or directories in the cache, lest the cache get confused or
serve incorrect data.

Do not chmod files in the cache.  The module creates things with minimal
permissions to prevent random users being able to access them directly.


Cache Culling
=============

The cache may need culling occasionally to make space.  This involves
discarding objects from the cache that have been used less recently than
anything else.  Culling is based on the access time of data objects.  Empty
directories are culled if not in use.

Cache culling is done on the basis of the percentage of blocks and the
percentage of files available in the underlying filesystem.
There are six
"limits":

 brun, frun
     If the amount of free space and the number of available files in the cache
     rise above both these limits, then culling is turned off.

 bcull, fcull
     If the amount of available space or the number of available files in the
     cache falls below either of these limits, then culling is started.

 bstop, fstop
     If the amount of available space or the number of available files in the
     cache falls below either of these limits, then no further allocation of
     disk space or files is permitted until culling has raised things above
     these limits again.

These must be configured as follows::

        0 <= bstop < bcull < brun < 100
        0 <= fstop < fcull < frun < 100

Note that these are percentages of available space and available files, and do
_not_ appear as 100 minus the percentage displayed by the "df" program.

The userspace daemon scans the cache to build up a table of cullable objects.
These are then culled in least recently used order.  A new scan of the cache is
started as soon as space is made in the table.
Objects will be skipped if
their atimes have changed or if the kernel module says it is still using them.


Cache Structure
===============

The CacheFiles module will create two directories in the directory it was
given:

 * cache/
 * graveyard/

The active cache objects all reside in the first directory.  The CacheFiles
kernel module moves any retired or culled objects that it can't simply unlink
to the graveyard from which the daemon will actually delete them.

The daemon uses dnotify to monitor the graveyard directory, and will delete
anything that appears therein.


The module represents index objects as directories with the filename "I..." or
"J...".  Note that the "cache/" directory is itself a special index.

Data objects are represented as files if they have no children, or directories
if they do.  Their filenames all begin "D..." or "E...".  If represented as a
directory, data objects will have a file in the directory called "data" that
actually holds the data.

Special objects are similar to data objects, except their filenames begin
"S..." or "T...".


If an object has children, then it will be represented as a directory.
Immediately in the representative directory is a collection of directories
named for hash values of the child object keys with an '@' prepended.  Into
these directories, if possible, will be placed the representations of the child
objects::

        /INDEX    /INDEX     /INDEX                            /DATA FILES
        /=========/==========/=================================/================
        cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400
        cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400/@75/Es0g000w...DB1ry
        cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400/@75/Es0g000w...N22ry
        cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400/@75/Es0g000w...FP1ry


If the key is so long that it exceeds NAME_MAX with the decorations added on to
it, then it will be cut into pieces, the first few of which will be used to
make a nest of directories, and the last of which will be the object
inside the last directory.
The names of the intermediate directories will have
'+' prepended::

        J1223/@23/+xy...z/+kl...m/Epqr


Note that keys are raw data, and not only may they exceed NAME_MAX in size,
they may also contain things like '/' and NUL characters, and so they may not
be suitable for turning directly into a filename.

To handle this, CacheFiles will use a suitably printable filename directly and
"base-64" encode ones that aren't directly suitable.  The two versions of
object filenames indicate the encoding:

        ===============  ===============  ===============
        OBJECT TYPE      PRINTABLE        ENCODED
        ===============  ===============  ===============
        Index            "I..."           "J..."
        Data             "D..."           "E..."
        Special          "S..."           "T..."
        ===============  ===============  ===============

Intermediate directories are always "@" or "+" as appropriate.


Each object in the cache has an extended attribute label that holds the object
type ID (required to distinguish special objects) and the auxiliary data from
the netfs.  The latter is used to detect stale objects in the cache and update
or retire them.

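The naming scheme in the table above can be read back mechanically from the
first character of each path component below "cache/".  The following is a
minimal sketch of that reading; the names ``PREFIXES``, ``classify`` and
``classify_path`` are hypothetical helpers for illustration and are not part of
CacheFiles or cachefilesd.

```python
# First character of a name -> (object kind, key encoding), per the table above.
PREFIXES = {
    "I": ("index", "printable"),
    "J": ("index", "encoded"),
    "D": ("data", "printable"),
    "E": ("data", "encoded"),
    "S": ("special", "printable"),
    "T": ("special", "encoded"),
    "@": ("hash directory", None),
    "+": ("key continuation directory", None),
}


def classify(name):
    """Classify a single path component found below cache/."""
    if name == "data":
        # File that holds the contents of a data object that has children.
        return ("data file", None)
    if not name or name[0] not in PREFIXES:
        raise ValueError("unrecognised cache name: %r" % (name,))
    return PREFIXES[name[0]]


def classify_path(path):
    """Classify every component of a path relative to cache/."""
    return [(part,) + classify(part) for part in path.split("/") if part]
```

For the long-key example above, the last element returned by
``classify_path("J1223/@23/+xy...z/+kl...m/Epqr")`` is
``("Epqr", "data", "encoded")``, and the two ``+`` components are reported as
key continuation directories.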


Note that CacheFiles will erase from the cache any file it doesn't recognise or
any file of an incorrect type (such as a FIFO file or a device file).


Security Model and SELinux
==========================

CacheFiles is implemented to deal properly with the LSM security features of
the Linux kernel and the SELinux facility.

One of the problems that CacheFiles faces is that it is generally acting on
behalf of a process, and running in that process's context, and that includes a
security context that is not appropriate for accessing the cache - either
because the files in the cache are inaccessible to that process, or because if
the process creates a file in the cache, that file may be inaccessible to other
processes.

The way CacheFiles works is to temporarily change the security context (fsuid,
fsgid and actor security label) that the process acts as - without changing the
security context of the process when it is the target of an operation performed
by some other process (so signalling and suchlike still work correctly).


When the CacheFiles module is asked to bind to its cache, it:

 (1) Finds the security label attached to the root cache directory and uses
     that as the security label with which it will create files.  By default,
     this is::

        cachefiles_var_t

 (2) Finds the security label of the process which issued the bind request
     (presumed to be the cachefilesd daemon), which by default will be::

        cachefilesd_t

     and asks LSM to supply a security ID as which it should act given the
     daemon's label.
     By default, this will be::

        cachefiles_kernel_t

     SELinux transitions the daemon's security ID to the module's security ID
     based on a rule of this form in the policy::

        type_transition <daemon's-ID> kernel_t : process <module's-ID>;

     For instance::

        type_transition cachefilesd_t kernel_t : process cachefiles_kernel_t;


The module's security ID gives it permission to create, move and remove files
and directories in the cache, to find and access directories and files in the
cache, to set and access extended attributes on cache objects, and to read and
write files in the cache.

The daemon's security ID gives it only a very restricted set of permissions: it
may scan directories, stat files and erase files and directories.  It may
not read or write files in the cache, and so it is precluded from accessing the
data cached therein; nor is it permitted to create new files in the cache.


There are policy source files available in:

        https://people.redhat.com/~dhowells/fscache/cachefilesd-0.8.tar.bz2

and later versions.
In that tarball, see the files::

        cachefilesd.te
        cachefilesd.fc
        cachefilesd.if

They are built and installed directly by the RPM.

If a non-RPM based system is being used, then copy the above files to their own
directory and run::

        make -f /usr/share/selinux/devel/Makefile
        semodule -i cachefilesd.pp

You will need checkpolicy and selinux-policy-devel installed prior to the
build.


By default, the cache is located in /var/fscache, but if it is desirable that
it should be elsewhere, then either the above policy files must be altered, or
an auxiliary policy must be installed to label the alternate location of the
cache.

For instructions on how to add an auxiliary policy to enable the cache to be
located elsewhere when SELinux is in enforcing mode, please see::

        /usr/share/doc/cachefilesd-*/move-cache.txt

This file is present when the cachefilesd rpm is installed; alternatively, the
document can be found in the sources.



A Note on Security
==================

CacheFiles makes use of the split security in the task_struct.  It allocates
its own task_security structure, and redirects current->cred to point to it
when it acts on behalf of another process, in that process's context.

The reason it does this is that it calls vfs_mkdir() and suchlike rather than
bypassing security and calling inode ops directly.  Therefore the VFS and LSM
may deny CacheFiles access to the cache data because under some
circumstances the caching code is running in the security context of whatever
process issued the original syscall on the netfs.

Furthermore, should CacheFiles create a file or directory, the security
parameters with which that object is created (UID, GID, security label) would
be derived from the process that issued the system call, thus potentially
preventing other processes from accessing the cache - including CacheFiles's
cache management daemon (cachefilesd).

What is required is to temporarily override the security of the process that
issued the system call.  We can't, however, just do an in-place change of the
security data as that affects the process as an object, not just as a subject.
This means it may lose signals or ptrace events for example, and affects what
the process looks like in /proc.

So CacheFiles makes use of a logical split in the security between the
objective security (task->real_cred) and the subjective security (task->cred).
The objective security holds the intrinsic security properties of a process and
is never overridden.  This is what appears in /proc, and is what is used when a
process is the target of an operation by some other process (SIGKILL for
example).

The subjective security holds the active security properties of a process, and
may be overridden.  This is not seen externally, and is used when a process
acts upon another object, for example SIGKILLing another process or opening a
file.

LSM hooks exist that allow SELinux (or Smack or whatever) to reject a request
for CacheFiles to run in a context of a specific security label, or to create
files and directories with another security label.



Statistical Information
=======================

If CacheFiles is compiled with the following option enabled::

        CONFIG_CACHEFILES_HISTOGRAM=y

then it will gather certain statistics and display them through a proc file.

 /proc/fs/cachefiles/histogram

     ::

        cat /proc/fs/cachefiles/histogram
        JIFS  SECS  LOOKUPS   MKDIRS    CREATES
        ===== ===== ========= ========= =========

     This shows, for each duration between 0 jiffies and HZ-1 jiffies, the
     number of times a variety of tasks took that long to run.
     The columns are as follows:

        =======  =======================================================
        COLUMN   TIME MEASUREMENT
        =======  =======================================================
        LOOKUPS  Length of time to perform a lookup on the backing fs
        MKDIRS   Length of time to perform a mkdir on the backing fs
        CREATES  Length of time to perform a create on the backing fs
        =======  =======================================================

     Each row shows the number of events that took a particular range of times.
     Each step is 1 jiffy in size.  The JIFS column indicates the particular
     jiffy range covered, and the SECS field the equivalent number of seconds.

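As an illustration of how the histogram might be consumed, the sketch below
parses the text of the proc file and computes a mean duration in jiffies for
one column.  It assumes that each data row holds five whitespace-separated
fields in the order of the header shown above (JIFS, SECS, LOOKUPS, MKDIRS,
CREATES); that row layout is an assumption, and the helper names
``parse_histogram`` and ``mean_jiffies`` are hypothetical.

```python
def parse_histogram(text):
    """Parse histogram text into a list of per-row dictionaries."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, the "=====" separator and anything malformed.
        if len(fields) != 5 or not fields[0].isdigit():
            continue
        rows.append({
            "jifs": int(fields[0]),      # jiffy bucket
            "secs": float(fields[1]),    # same bucket expressed in seconds
            "lookups": int(fields[2]),
            "mkdirs": int(fields[3]),
            "creates": int(fields[4]),
        })
    return rows


def mean_jiffies(rows, column):
    """Event-weighted mean duration in jiffies, or None if there are no events."""
    total = sum(row[column] for row in rows)
    if total == 0:
        return None
    return sum(row["jifs"] * row[column] for row in rows) / total
```

A caller would typically pass in the contents of
``/proc/fs/cachefiles/histogram`` and then ask for, say,
``mean_jiffies(rows, "lookups")``.  Dividing the result by the kernel's HZ
value converts it to seconds.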


Debugging
=========

If CONFIG_CACHEFILES_DEBUG is enabled, the CacheFiles facility can have runtime
debugging enabled by adjusting the value in::

        /sys/module/cachefiles/parameters/debug

This is a bitmask of debugging streams to enable:

        =======  =======  ===============================  =======================
        BIT      VALUE    STREAM                           POINT
        =======  =======  ===============================  =======================
        0        1        General                          Function entry trace
        1        2                                         Function exit trace
        2        4                                         General
        =======  =======  ===============================  =======================

The appropriate set of values should be OR'd together and the result written to
the control file.  For example::

        echo $((1|4)) >/sys/module/cachefiles/parameters/debug

will turn on function entry tracing and the general debug points.

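The bitmask arithmetic can also be done programmatically.  The sketch below
only composes and decodes the value; it does not write to sysfs.  The bit
values come from the table above, while the names ``DEBUG_BITS``,
``debug_mask`` and ``describe_mask`` and the short labels "enter", "leave" and
"debug" are hypothetical choices made for this example.

```python
# Bit values from the debugging table; the labels are illustrative only.
DEBUG_BITS = {
    "enter": 1,   # function entry trace
    "leave": 2,   # function exit trace
    "debug": 4,   # general debug points
}


def debug_mask(*names):
    """OR together the bits for the requested debugging points."""
    mask = 0
    for name in names:
        mask |= DEBUG_BITS[name]
    return mask


def describe_mask(mask):
    """List the debugging points that a mask value enables."""
    return [name for name, bit in DEBUG_BITS.items() if mask & bit]
```

For instance, ``debug_mask("enter", "debug")`` yields 5, which is the value
that would then be written to the control file.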


On-demand Read
==============

In its original mode, CacheFiles serves as a local cache for a remote network
filesystem.  In on-demand read mode, CacheFiles can speed up scenarios that
need on-demand read semantics, e.g. container image distribution.

The essential difference between these two modes is seen when a cache miss
occurs: in the original mode, the netfs will fetch the data from the remote
server and then write it to the cache file; in on-demand read mode, fetching
the data and writing it into the cache is delegated to a user daemon.

``CONFIG_CACHEFILES_ONDEMAND`` should be enabled to support on-demand read mode.


Protocol Communication
----------------------

The on-demand read mode uses a simple protocol for communication between the
kernel and the user daemon.  The protocol can be modeled as::

        kernel --[request]--> user daemon --[reply]--> kernel

CacheFiles will send requests to the user daemon when needed.  The user daemon
should poll the devnode ('/dev/cachefiles') to check if there's a pending
request to be processed.  A POLLIN event will be returned when there's a pending
request.

The user daemon then reads the devnode to fetch a request to process.  Note
that each read returns only one request.
When it has finished processing the request, the user daemon should write the
reply to the devnode.

Each request starts with a message header of the form::

        struct cachefiles_msg {
                __u32 msg_id;
                __u32 opcode;
                __u32 len;
                __u32 object_id;
                __u8  data[];
        };

where:

        * ``msg_id`` is a unique ID identifying this request among all pending
          requests.

        * ``opcode`` indicates the type of this request.

        * ``object_id`` is a unique ID identifying the cache file operated on.

        * ``data`` indicates the payload of this request.

        * ``len`` indicates the whole length of this request, including the
          header and following type-specific payload.


Turning on On-demand Mode
-------------------------

An optional parameter becomes available to the "bind" command::

        bind [ondemand]

When the "bind" command is given no argument, it defaults to the original mode.
When it is given the "ondemand" argument, i.e. "bind ondemand", on-demand read
mode will be enabled.
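
As a rough illustration of the message header described under "Protocol
Communication" above, the sketch below validates the bytes returned by a single
read of the devnode and locates the type-specific payload.  The ``msg_hdr``
struct is only a local mirror of the documented layout using fixed-width C
types, and ``parse_msg`` is a hypothetical helper name; neither is part of the
kernel interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Local mirror of the fixed part of the documented message header.
 * Assumption: a real daemon would take the definition from the kernel's
 * UAPI headers rather than redeclare it.
 */
struct msg_hdr {
	uint32_t msg_id;
	uint32_t opcode;
	uint32_t len;		/* header plus payload, in bytes */
	uint32_t object_id;
};

/*
 * Decode one request from buf, which holds the n bytes returned by a
 * single read.  On success the header is copied into *hdr, *payload_len
 * is set, and a pointer to the payload is returned.  NULL is returned if
 * the buffer is shorter than a header or disagrees with the len field.
 */
static const uint8_t *parse_msg(const uint8_t *buf, size_t n,
				struct msg_hdr *hdr, size_t *payload_len)
{
	if (n < sizeof(*hdr))
		return NULL;

	/* memcpy avoids unaligned access into the raw buffer. */
	memcpy(hdr, buf, sizeof(*hdr));

	if (hdr->len < sizeof(*hdr) || hdr->len > n)
		return NULL;

	*payload_len = hdr->len - sizeof(*hdr);
	return buf + sizeof(*hdr);
}
```

A daemon loop built on this would wait for POLLIN, issue one read per request,
pass the buffer to ``parse_msg`` and then dispatch on ``hdr.opcode``.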


The OPEN Request
----------------

When the netfs opens a cache file for the first time, a request with the
CACHEFILES_OP_OPEN opcode, a.k.a. an OPEN request, will be sent to the user
daemon.  The payload format is of the form::

        struct cachefiles_open {
                __u32 volume_key_size;
                __u32 cookie_key_size;
                __u32 fd;
                __u32 flags;
                __u8  data[];
        };

where:

        * ``data`` contains the volume_key followed directly by the cookie_key.
          The volume key is a NUL-terminated string; the cookie key is binary
          data.

        * ``volume_key_size`` indicates the size of the volume key in bytes.

        * ``cookie_key_size`` indicates the size of the cookie key in bytes.

        * ``fd`` indicates an anonymous fd referring to the cache file, through
          which the user daemon can perform write/llseek file operations on the
          cache file.


The user daemon can use the given (volume_key, cookie_key) pair to distinguish
the requested cache file.  With the given anonymous fd, the user daemon can
fetch the data and write it to the cache file in the background, even when
the kernel has not triggered a cache miss yet.

Note that each cache file has a unique object_id, while it may have multiple
anonymous fds.
The user daemon may duplicate anonymous fds from the initial anonymous fd
indicated by the ``fd`` field through dup().  Thus each object_id can be mapped
to multiple anonymous fds, and the user daemon itself needs to maintain the
mapping.

When implementing a user daemon, please be careful of RLIMIT_NOFILE,
``/proc/sys/fs/nr_open`` and ``/proc/sys/fs/file-max``.  Typically these needn't
be huge since they're related to the number of open device blobs rather than
open files of each individual filesystem.

The user daemon should reply to the OPEN request by issuing a "copen" (complete
open) command on the devnode::

        copen <msg_id>,<cache_size>

where:

        * ``msg_id`` must match the msg_id field of the OPEN request.

        * When >= 0, ``cache_size`` indicates the size of the cache file;
          when < 0, ``cache_size`` indicates any error code encountered by the
          user daemon.


The CLOSE Request
-----------------

When a cookie is withdrawn, a CLOSE request (opcode CACHEFILES_OP_CLOSE) will
be sent to the user daemon.  This tells the user daemon to close all anonymous
fds associated with the given object_id.  The CLOSE request has no extra
payload, and should not be replied to.
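
To make the OPEN handling described above more concrete, the following sketch
splits an OPEN payload into its two keys and formats the matching "copen"
reply string.  The ``open_payload`` struct is a local mirror of the documented
layout, and ``split_open`` and ``format_copen`` are hypothetical helper names
chosen for this example; writing the resulting string to the devnode is left
to the caller.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Local mirror of the fixed part of the documented OPEN payload. */
struct open_payload {
	uint32_t volume_key_size;
	uint32_t cookie_key_size;
	uint32_t fd;
	uint32_t flags;
};

/*
 * Split an OPEN payload of n bytes into the volume key and cookie key.
 * Returns 0 on success, or -1 if the two key sizes do not fit in the
 * bytes that follow the fixed fields.
 */
static int split_open(const uint8_t *p, size_t n, struct open_payload *o,
		      const uint8_t **volume_key, const uint8_t **cookie_key)
{
	if (n < sizeof(*o))
		return -1;
	memcpy(o, p, sizeof(*o));

	/* Add in 64 bits so that two large sizes cannot wrap around. */
	if ((uint64_t)o->volume_key_size + o->cookie_key_size > n - sizeof(*o))
		return -1;

	*volume_key = p + sizeof(*o);
	*cookie_key = *volume_key + o->volume_key_size;
	return 0;
}

/*
 * Format "copen <msg_id>,<cache_size>" into out.  size_or_err is the cache
 * file size when >= 0, or a negative error code.  Returns the string
 * length, or -1 if out is too small.
 */
static int format_copen(char *out, size_t outsz, uint32_t msg_id,
			long long size_or_err)
{
	int n = snprintf(out, outsz, "copen %u,%lld",
			 (unsigned int)msg_id, size_or_err);

	return (n < 0 || (size_t)n >= outsz) ? -1 : n;
}
```

Because the daemon has to keep the object_id to fd mapping itself, a natural
place to record ``o.fd`` against the header's ``object_id`` is right after
``split_open`` succeeds and before the reply is written.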


The READ Request
----------------

When a cache miss is encountered in on-demand read mode, CacheFiles will send a
READ request (opcode CACHEFILES_OP_READ) to the user daemon.  This tells the
user daemon to fetch the contents of the requested file range.  The payload is
of the form::

        struct cachefiles_read {
                __u64 off;
                __u64 len;
        };

where:

        * ``off`` indicates the starting offset of the requested file range.

        * ``len`` indicates the length of the requested file range.


When it receives a READ request, the user daemon should fetch the requested data
and write it to the cache file identified by object_id.

When it has finished processing the READ request, the user daemon should reply
by using the CACHEFILES_IOC_READ_COMPLETE ioctl on one of the anonymous fds
associated with the object_id given in the READ request.  The ioctl is of the
form::

        ioctl(fd, CACHEFILES_IOC_READ_COMPLETE, msg_id);

where:

        * ``fd`` is one of the anonymous fds associated with the object_id
          given.

        * ``msg_id`` must match the msg_id field of the READ request.
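
The READ handling described above can be sketched roughly as follows.  This is
an illustration under several assumptions rather than a reference
implementation: ``read_payload`` is a local mirror of the documented layout,
``fill_range`` and the ``fetch_fn`` callback are hypothetical names, and
``fetch_pattern`` is a stand-in data source that produces a deterministic byte
pattern instead of contacting a remote server.  The sketch uses pwrite() for
brevity; lseek() followed by write() corresponds more literally to the
write/llseek operations named in the text.  The completion ioctl itself is
only indicated in a comment.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Local mirror of the documented READ payload. */
struct read_payload {
	uint64_t off;
	uint64_t len;
};

/* Hypothetical callback: fetch up to len bytes of the file at offset off. */
typedef ssize_t (*fetch_fn)(uint64_t off, void *buf, size_t len);

/* Stand-in data source: byte i of the file has the value (i & 0xff). */
static ssize_t fetch_pattern(uint64_t off, void *buf, size_t len)
{
	uint8_t *p = buf;

	for (size_t i = 0; i < len; i++)
		p[i] = (uint8_t)((off + i) & 0xff);
	return (ssize_t)len;
}

/*
 * Fetch the range named by a READ payload and write it to fd at the same
 * offsets.  Returns 0 once the whole range has been written, or -1 on a
 * short payload, a fetch failure or a write failure.
 */
static int fill_range(int fd, const uint8_t *payload, size_t n, fetch_fn fetch)
{
	struct read_payload r;
	uint8_t chunk[4096];

	if (n < sizeof(r))
		return -1;
	memcpy(&r, payload, sizeof(r));

	while (r.len > 0) {
		size_t want = r.len < sizeof(chunk) ? (size_t)r.len
						    : sizeof(chunk);
		ssize_t got = fetch(r.off, chunk, want);

		if (got <= 0)
			return -1;
		if (pwrite(fd, chunk, (size_t)got, (off_t)r.off) != got)
			return -1;
		r.off += (uint64_t)got;
		r.len -= (uint64_t)got;
	}

	/*
	 * A real daemon would now reply with
	 * ioctl(fd, CACHEFILES_IOC_READ_COMPLETE, msg_id).
	 */
	return 0;
}
```

In a daemon, ``fd`` would be one of the anonymous fds recorded for the
request's object_id, and ``fetch`` would be replaced by whatever transport
supplies the file contents.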