1.. SPDX-License-Identifier: GPL-2.0 2 3:orphan: 4 5.. UBIFS Authentication 6.. sigma star gmbh 7.. 2018 8 9Introduction 10============ 11 12UBIFS utilizes the fscrypt framework to provide confidentiality for file 13contents and file names. This prevents attacks where an attacker is able to 14read contents of the filesystem on a single point in time. A classic example 15is a lost smartphone where the attacker is unable to read personal data stored 16on the device without the filesystem decryption key. 17 18At the current state, UBIFS encryption however does not prevent attacks where 19the attacker is able to modify the filesystem contents and the user uses the 20device afterwards. In such a scenario an attacker can modify filesystem 21contents arbitrarily without the user noticing. One example is to modify a 22binary to perform a malicious action when executed [DMC-CBC-ATTACK]. Since 23most of the filesystem metadata of UBIFS is stored in plain, this makes it 24fairly easy to swap files and replace their contents. 25 26Other full disk encryption systems like dm-crypt cover all filesystem metadata, 27which makes such kinds of attacks more complicated, but not impossible. 28Especially, if the attacker is given access to the device multiple points in 29time. For dm-crypt and other filesystems that build upon the Linux block IO 30layer, the dm-integrity or dm-verity subsystems [DM-INTEGRITY, DM-VERITY] 31can be used to get full data authentication at the block layer. 32These can also be combined with dm-crypt [CRYPTSETUP2]. 33 34This document describes an approach to get file contents _and_ full metadata 35authentication for UBIFS. Since UBIFS uses fscrypt for file contents and file 36name encryption, the authentication system could be tied into fscrypt such that 37existing features like key derivation can be utilized. It should however also 38be possible to use UBIFS authentication without using encryption. 39 40 41MTD, UBI & UBIFS 42---------------- 43 44On Linux, the MTD (Memory Technology Devices) subsystem provides a uniform 45interface to access raw flash devices. One of the more prominent subsystems that 46work on top of MTD is UBI (Unsorted Block Images). It provides volume management 47for flash devices and is thus somewhat similar to LVM for block devices. In 48addition, it deals with flash-specific wear-leveling and transparent I/O error 49handling. UBI offers logical erase blocks (LEBs) to the layers on top of it 50and maps them transparently to physical erase blocks (PEBs) on the flash. 51 52UBIFS is a filesystem for raw flash which operates on top of UBI. Thus, wear 53leveling and some flash specifics are left to UBI, while UBIFS focuses on 54scalability, performance and recoverability. 55 56:: 57 58 +------------+ +*******+ +-----------+ +-----+ 59 | | * UBIFS * | UBI-BLOCK | | ... | 60 | JFFS/JFFS2 | +*******+ +-----------+ +-----+ 61 | | +-----------------------------+ +-----------+ +-----+ 62 | | | UBI | | MTD-BLOCK | | ... | 63 +------------+ +-----------------------------+ +-----------+ +-----+ 64 +------------------------------------------------------------------+ 65 | MEMORY TECHNOLOGY DEVICES (MTD) | 66 +------------------------------------------------------------------+ 67 +-----------------------------+ +--------------------------+ +-----+ 68 | NAND DRIVERS | | NOR DRIVERS | | ... | 69 +-----------------------------+ +--------------------------+ +-----+ 70 71 Figure 1: Linux kernel subsystems for dealing with raw flash 72 73 74 75Internally, UBIFS maintains multiple data structures which are persisted on 76the flash: 77 78- *Index*: an on-flash B+ tree where the leaf nodes contain filesystem data 79- *Journal*: an additional data structure to collect FS changes before updating 80 the on-flash index and reduce flash wear. 81- *Tree Node Cache (TNC)*: an in-memory B+ tree that reflects the current FS 82 state to avoid frequent flash reads. It is basically the in-memory 83 representation of the index, but contains additional attributes. 84- *LEB property tree (LPT)*: an on-flash B+ tree for free space accounting per 85 UBI LEB. 86 87In the remainder of this section we will cover the on-flash UBIFS data 88structures in more detail. The TNC is of less importance here since it is never 89persisted onto the flash directly. More details on UBIFS can also be found in 90[UBIFS-WP]. 91 92 93UBIFS Index & Tree Node Cache 94~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 95 96Basic on-flash UBIFS entities are called *nodes*. UBIFS knows different types 97of nodes. Eg. data nodes (``struct ubifs_data_node``) which store chunks of file 98contents or inode nodes (``struct ubifs_ino_node``) which represent VFS inodes. 99Almost all types of nodes share a common header (``ubifs_ch``) containing basic 100information like node type, node length, a sequence number, etc. (see 101``fs/ubifs/ubifs-media.h`` in kernel source). Exceptions are entries of the LPT 102and some less important node types like padding nodes which are used to pad 103unusable content at the end of LEBs. 104 105To avoid re-writing the whole B+ tree on every single change, it is implemented 106as *wandering tree*, where only the changed nodes are re-written and previous 107versions of them are obsoleted without erasing them right away. As a result, 108the index is not stored in a single place on the flash, but *wanders* around 109and there are obsolete parts on the flash as long as the LEB containing them is 110not reused by UBIFS. To find the most recent version of the index, UBIFS stores 111a special node called *master node* into UBI LEB 1 which always points to the 112most recent root node of the UBIFS index. For recoverability, the master node 113is additionally duplicated to LEB 2. Mounting UBIFS is thus a simple read of 114LEB 1 and 2 to get the current master node and from there get the location of 115the most recent on-flash index. 116 117The TNC is the in-memory representation of the on-flash index. It contains some 118additional runtime attributes per node which are not persisted. One of these is 119a dirty-flag which marks nodes that have to be persisted the next time the 120index is written onto the flash. The TNC acts as a write-back cache and all 121modifications of the on-flash index are done through the TNC. Like other caches, 122the TNC does not have to mirror the full index into memory, but reads parts of 123it from flash whenever needed. A *commit* is the UBIFS operation of updating the 124on-flash filesystem structures like the index. On every commit, the TNC nodes 125marked as dirty are written to the flash to update the persisted index. 126 127 128Journal 129~~~~~~~ 130 131To avoid wearing out the flash, the index is only persisted (*commited*) when 132certain conditions are met (eg. ``fsync(2)``). The journal is used to record 133any changes (in form of inode nodes, data nodes etc.) between commits 134of the index. During mount, the journal is read from the flash and replayed 135onto the TNC (which will be created on-demand from the on-flash index). 136 137UBIFS reserves a bunch of LEBs just for the journal called *log area*. The 138amount of log area LEBs is configured on filesystem creation (using 139``mkfs.ubifs``) and stored in the superblock node. The log area contains only 140two types of nodes: *reference nodes* and *commit start nodes*. A commit start 141node is written whenever an index commit is performed. Reference nodes are 142written on every journal update. Each reference node points to the position of 143other nodes (inode nodes, data nodes etc.) on the flash that are part of this 144journal entry. These nodes are called *buds* and describe the actual filesystem 145changes including their data. 146 147The log area is maintained as a ring. Whenever the journal is almost full, 148a commit is initiated. This also writes a commit start node so that during 149mount, UBIFS will seek for the most recent commit start node and just replay 150every reference node after that. Every reference node before the commit start 151node will be ignored as they are already part of the on-flash index. 152 153When writing a journal entry, UBIFS first ensures that enough space is 154available to write the reference node and buds part of this entry. Then, the 155reference node is written and afterwards the buds describing the file changes. 156On replay, UBIFS will record every reference node and inspect the location of 157the referenced LEBs to discover the buds. If these are corrupt or missing, 158UBIFS will attempt to recover them by re-reading the LEB. This is however only 159done for the last referenced LEB of the journal. Only this can become corrupt 160because of a power cut. If the recovery fails, UBIFS will not mount. An error 161for every other LEB will directly cause UBIFS to fail the mount operation. 162 163:: 164 165 | ---- LOG AREA ---- | ---------- MAIN AREA ------------ | 166 167 -----+------+-----+--------+---- ------+-----+-----+--------------- 168 \ | | | | / / | | | \ 169 / CS | REF | REF | | \ \ DENT | INO | INO | / 170 \ | | | | / / | | | \ 171 ----+------+-----+--------+--- -------+-----+-----+---------------- 172 | | ^ ^ 173 | | | | 174 +------------------------+ | 175 | | 176 +-------------------------------+ 177 178 179 Figure 2: UBIFS flash layout of log area with commit start nodes 180 (CS) and reference nodes (REF) pointing to main area 181 containing their buds 182 183 184LEB Property Tree/Table 185~~~~~~~~~~~~~~~~~~~~~~~ 186 187The LEB property tree is used to store per-LEB information. This includes the 188LEB type and amount of free and *dirty* (old, obsolete content) space [1]_ on 189the LEB. The type is important, because UBIFS never mixes index nodes with data 190nodes on a single LEB and thus each LEB has a specific purpose. This again is 191useful for free space calculations. See [UBIFS-WP] for more details. 192 193The LEB property tree again is a B+ tree, but it is much smaller than the 194index. Due to its smaller size it is always written as one chunk on every 195commit. Thus, saving the LPT is an atomic operation. 196 197 198.. [1] Since LEBs can only be appended and never overwritten, there is a 199 difference between free space ie. the remaining space left on the LEB to be 200 written to without erasing it and previously written content that is obsolete 201 but can't be overwritten without erasing the full LEB. 202 203 204UBIFS Authentication 205==================== 206 207This chapter introduces UBIFS authentication which enables UBIFS to verify 208the authenticity and integrity of metadata and file contents stored on flash. 209 210 211Threat Model 212------------ 213 214UBIFS authentication enables detection of offline data modification. While it 215does not prevent it, it enables (trusted) code to check the integrity and 216authenticity of on-flash file contents and filesystem metadata. This covers 217attacks where file contents are swapped. 218 219UBIFS authentication will not protect against rollback of full flash contents. 220Ie. an attacker can still dump the flash and restore it at a later time without 221detection. It will also not protect against partial rollback of individual 222index commits. That means that an attacker is able to partially undo changes. 223This is possible because UBIFS does not immediately overwrites obsolete 224versions of the index tree or the journal, but instead marks them as obsolete 225and garbage collection erases them at a later time. An attacker can use this by 226erasing parts of the current tree and restoring old versions that are still on 227the flash and have not yet been erased. This is possible, because every commit 228will always write a new version of the index root node and the master node 229without overwriting the previous version. This is further helped by the 230wear-leveling operations of UBI which copies contents from one physical 231eraseblock to another and does not atomically erase the first eraseblock. 232 233UBIFS authentication does not cover attacks where an attacker is able to 234execute code on the device after the authentication key was provided. 235Additional measures like secure boot and trusted boot have to be taken to 236ensure that only trusted code is executed on a device. 237 238 239Authentication 240-------------- 241 242To be able to fully trust data read from flash, all UBIFS data structures 243stored on flash are authenticated. That is: 244 245- The index which includes file contents, file metadata like extended 246 attributes, file length etc. 247- The journal which also contains file contents and metadata by recording changes 248 to the filesystem 249- The LPT which stores UBI LEB metadata which UBIFS uses for free space accounting 250 251 252Index Authentication 253~~~~~~~~~~~~~~~~~~~~ 254 255Through UBIFS' concept of a wandering tree, it already takes care of only 256updating and persisting changed parts from leaf node up to the root node 257of the full B+ tree. This enables us to augment the index nodes of the tree 258with a hash over each node's child nodes. As a result, the index basically also 259a Merkle tree. Since the leaf nodes of the index contain the actual filesystem 260data, the hashes of their parent index nodes thus cover all the file contents 261and file metadata. When a file changes, the UBIFS index is updated accordingly 262from the leaf nodes up to the root node including the master node. This process 263can be hooked to recompute the hash only for each changed node at the same time. 264Whenever a file is read, UBIFS can verify the hashes from each leaf node up to 265the root node to ensure the node's integrity. 266 267To ensure the authenticity of the whole index, the UBIFS master node stores a 268keyed hash (HMAC) over its own contents and a hash of the root node of the index 269tree. As mentioned above, the master node is always written to the flash whenever 270the index is persisted (ie. on index commit). 271 272Using this approach only UBIFS index nodes and the master node are changed to 273include a hash. All other types of nodes will remain unchanged. This reduces 274the storage overhead which is precious for users of UBIFS (ie. embedded 275devices). 276 277:: 278 279 +---------------+ 280 | Master Node | 281 | (hash) | 282 +---------------+ 283 | 284 v 285 +-------------------+ 286 | Index Node #1 | 287 | | 288 | branch0 branchn | 289 | (hash) (hash) | 290 +-------------------+ 291 | ... | (fanout: 8) 292 | | 293 +-------+ +------+ 294 | | 295 v v 296 +-------------------+ +-------------------+ 297 | Index Node #2 | | Index Node #3 | 298 | | | | 299 | branch0 branchn | | branch0 branchn | 300 | (hash) (hash) | | (hash) (hash) | 301 +-------------------+ +-------------------+ 302 | ... | ... | 303 v v v 304 +-----------+ +----------+ +-----------+ 305 | Data Node | | INO Node | | DENT Node | 306 +-----------+ +----------+ +-----------+ 307 308 309 Figure 3: Coverage areas of index node hash and master node HMAC 310 311 312 313The most important part for robustness and power-cut safety is to atomically 314persist the hash and file contents. Here the existing UBIFS logic for how 315changed nodes are persisted is already designed for this purpose such that 316UBIFS can safely recover if a power-cut occurs while persisting. Adding 317hashes to index nodes does not change this since each hash will be persisted 318atomically together with its respective node. 319 320 321Journal Authentication 322~~~~~~~~~~~~~~~~~~~~~~ 323 324The journal is authenticated too. Since the journal is continuously written 325it is necessary to also add authentication information frequently to the 326journal so that in case of a powercut not too much data can't be authenticated. 327This is done by creating a continuous hash beginning from the commit start node 328over the previous reference nodes, the current reference node, and the bud 329nodes. From time to time whenever it is suitable authentication nodes are added 330between the bud nodes. This new node type contains a HMAC over the current state 331of the hash chain. That way a journal can be authenticated up to the last 332authentication node. The tail of the journal which may not have a authentication 333node cannot be authenticated and is skipped during journal replay. 334 335We get this picture for journal authentication:: 336 337 ,,,,,,,, 338 ,......,........................................... 339 ,. CS , hash1.----. hash2.----. 340 ,. | , . |hmac . |hmac 341 ,. v , . v . v 342 ,.REF#0,-> bud -> bud -> bud.-> auth -> bud -> bud.-> auth ... 343 ,..|...,........................................... 344 , | , 345 , | ,,,,,,,,,,,,,,, 346 . | hash3,----. 347 , | , |hmac 348 , v , v 349 , REF#1 -> bud -> bud,-> auth ... 350 ,,,|,,,,,,,,,,,,,,,,,, 351 v 352 REF#2 -> ... 353 | 354 V 355 ... 356 357Since the hash also includes the reference nodes an attacker cannot reorder or 358skip any journal heads for replay. An attacker can only remove bud nodes or 359reference nodes from the end of the journal, effectively rewinding the 360filesystem at maximum back to the last commit. 361 362The location of the log area is stored in the master node. Since the master 363node is authenticated with a HMAC as described above, it is not possible to 364tamper with that without detection. The size of the log area is specified when 365the filesystem is created using `mkfs.ubifs` and stored in the superblock node. 366To avoid tampering with this and other values stored there, a HMAC is added to 367the superblock struct. The superblock node is stored in LEB 0 and is only 368modified on feature flag or similar changes, but never on file changes. 369 370 371LPT Authentication 372~~~~~~~~~~~~~~~~~~ 373 374The location of the LPT root node on the flash is stored in the UBIFS master 375node. Since the LPT is written and read atomically on every commit, there is 376no need to authenticate individual nodes of the tree. It suffices to 377protect the integrity of the full LPT by a simple hash stored in the master 378node. Since the master node itself is authenticated, the LPTs authenticity can 379be verified by verifying the authenticity of the master node and comparing the 380LTP hash stored there with the hash computed from the read on-flash LPT. 381 382 383Key Management 384-------------- 385 386For simplicity, UBIFS authentication uses a single key to compute the HMACs 387of superblock, master, commit start and reference nodes. This key has to be 388available on creation of the filesystem (`mkfs.ubifs`) to authenticate the 389superblock node. Further, it has to be available on mount of the filesystem 390to verify authenticated nodes and generate new HMACs for changes. 391 392UBIFS authentication is intended to operate side-by-side with UBIFS encryption 393(fscrypt) to provide confidentiality and authenticity. Since UBIFS encryption 394has a different approach of encryption policies per directory, there can be 395multiple fscrypt master keys and there might be folders without encryption. 396UBIFS authentication on the other hand has an all-or-nothing approach in the 397sense that it either authenticates everything of the filesystem or nothing. 398Because of this and because UBIFS authentication should also be usable without 399encryption, it does not share the same master key with fscrypt, but manages 400a dedicated authentication key. 401 402The API for providing the authentication key has yet to be defined, but the 403key can eg. be provided by userspace through a keyring similar to the way it 404is currently done in fscrypt. It should however be noted that the current 405fscrypt approach has shown its flaws and the userspace API will eventually 406change [FSCRYPT-POLICY2]. 407 408Nevertheless, it will be possible for a user to provide a single passphrase 409or key in userspace that covers UBIFS authentication and encryption. This can 410be solved by the corresponding userspace tools which derive a second key for 411authentication in addition to the derived fscrypt master key used for 412encryption. 413 414To be able to check if the proper key is available on mount, the UBIFS 415superblock node will additionally store a hash of the authentication key. This 416approach is similar to the approach proposed for fscrypt encryption policy v2 417[FSCRYPT-POLICY2]. 418 419 420Future Extensions 421================= 422 423In certain cases where a vendor wants to provide an authenticated filesystem 424image to customers, it should be possible to do so without sharing the secret 425UBIFS authentication key. Instead, in addition the each HMAC a digital 426signature could be stored where the vendor shares the public key alongside the 427filesystem image. In case this filesystem has to be modified afterwards, 428UBIFS can exchange all digital signatures with HMACs on first mount similar 429to the way the IMA/EVM subsystem deals with such situations. The HMAC key 430will then have to be provided beforehand in the normal way. 431 432 433References 434========== 435 436[CRYPTSETUP2] http://www.saout.de/pipermail/dm-crypt/2017-November/005745.html 437 438[DMC-CBC-ATTACK] http://www.jakoblell.com/blog/2013/12/22/practical-malleability-attack-against-cbc-encrypted-luks-partitions/ 439 440[DM-INTEGRITY] https://www.kernel.org/doc/Documentation/device-mapper/dm-integrity.rst 441 442[DM-VERITY] https://www.kernel.org/doc/Documentation/device-mapper/verity.rst 443 444[FSCRYPT-POLICY2] https://www.spinics.net/lists/linux-ext4/msg58710.html 445 446[UBIFS-WP] http://www.linux-mtd.infradead.org/doc/ubifs_whitepaper.pdf 447