.. SPDX-License-Identifier: GPL-2.0

.. UBIFS Authentication
.. sigma star gmbh
.. 2018

============================
UBIFS Authentication Support
============================

Introduction
============

UBIFS utilizes the fscrypt framework to provide confidentiality for file
contents and file names. This prevents attacks where an attacker is able to
read the contents of the filesystem at a single point in time. A classic
example is a lost smartphone where the attacker is unable to read personal data
stored on the device without the filesystem decryption key.

In its current state, however, UBIFS encryption does not prevent attacks where
the attacker is able to modify the filesystem contents and the user uses the
device afterwards. In such a scenario an attacker can modify filesystem
contents arbitrarily without the user noticing. One example is to modify a
binary to perform a malicious action when executed [DMC-CBC-ATTACK]. Since
most of the filesystem metadata of UBIFS is stored in plain text, it is
fairly easy to swap files and replace their contents.

Other full disk encryption systems like dm-crypt cover all filesystem metadata,
which makes such kinds of attacks more complicated, but not impossible,
especially if the attacker is given access to the device at multiple points in
time. For dm-crypt and other filesystems that build upon the Linux block IO
layer, the dm-integrity or dm-verity subsystems [DM-INTEGRITY, DM-VERITY]
can be used to get full data authentication at the block layer.
These can also be combined with dm-crypt [CRYPTSETUP2].

This document describes an approach to get file contents *and* full metadata
authentication for UBIFS. Since UBIFS uses fscrypt for file contents and file
name encryption, the authentication system could be tied into fscrypt such that
existing features like key derivation can be utilized. It should however also
be possible to use UBIFS authentication without using encryption.


MTD, UBI & UBIFS
----------------

On Linux, the MTD (Memory Technology Devices) subsystem provides a uniform
interface to access raw flash devices. One of the more prominent subsystems that
work on top of MTD is UBI (Unsorted Block Images). It provides volume management
for flash devices and is thus somewhat similar to LVM for block devices. In
addition, it deals with flash-specific wear-leveling and transparent I/O error
handling. UBI offers logical erase blocks (LEBs) to the layers on top of it
and maps them transparently to physical erase blocks (PEBs) on the flash.

UBIFS is a filesystem for raw flash which operates on top of UBI. Thus, wear
leveling and some flash specifics are left to UBI, while UBIFS focuses on
scalability, performance and recoverability.

::

	+------------+ +*******+ +-----------+ +-----+
	|            | * UBIFS * | UBI-BLOCK | | ... |
	| JFFS/JFFS2 | +*******+ +-----------+ +-----+
	|            | +-----------------------------+ +-----------+ +-----+
	|            | |              UBI            | | MTD-BLOCK | | ... |
	+------------+ +-----------------------------+ +-----------+ +-----+
	+------------------------------------------------------------------+
	|                  MEMORY TECHNOLOGY DEVICES (MTD)                 |
	+------------------------------------------------------------------+
	+-----------------------------+ +--------------------------+ +-----+
	|         NAND DRIVERS        | |        NOR DRIVERS       | | ... |
	+-----------------------------+ +--------------------------+ +-----+

            Figure 1: Linux kernel subsystems for dealing with raw flash



Internally, UBIFS maintains multiple data structures which are persisted on
the flash:

- *Index*: an on-flash B+ tree where the leaf nodes contain filesystem data
- *Journal*: an additional data structure to collect FS changes before updating
  the on-flash index and reduce flash wear.
- *Tree Node Cache (TNC)*: an in-memory B+ tree that reflects the current FS
  state to avoid frequent flash reads. It is basically the in-memory
  representation of the index, but contains additional attributes.
- *LEB property tree (LPT)*: an on-flash B+ tree for free space accounting per
  UBI LEB.

In the remainder of this section we will cover the on-flash UBIFS data
structures in more detail. The TNC is of less importance here since it is never
persisted onto the flash directly. More details on UBIFS can also be found in
[UBIFS-WP].


UBIFS Index & Tree Node Cache
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Basic on-flash UBIFS entities are called *nodes*. UBIFS knows different types
of nodes, e.g. data nodes (``struct ubifs_data_node``) which store chunks of
file contents or inode nodes (``struct ubifs_ino_node``) which represent VFS
inodes. Almost all types of nodes share a common header (``ubifs_ch``)
containing basic information like node type, node length, a sequence number,
etc. (see ``fs/ubifs/ubifs-media.h`` in kernel source). Exceptions are entries
of the LPT and some less important node types like padding nodes which are used
to pad unusable content at the end of LEBs.

To avoid re-writing the whole B+ tree on every single change, it is implemented
as a *wandering tree*, where only the changed nodes are re-written and previous
versions of them are obsoleted without erasing them right away. As a result,
the index is not stored in a single place on the flash, but *wanders* around
and there are obsolete parts on the flash as long as the LEB containing them is
not reused by UBIFS. To find the most recent version of the index, UBIFS stores
a special node called *master node* into UBI LEB 1 which always points to the
most recent root node of the UBIFS index. For recoverability, the master node
is additionally duplicated to LEB 2. Mounting UBIFS is thus a simple read of
LEB 1 and 2 to get the current master node and from there get the location of
the most recent on-flash index.

The TNC is the in-memory representation of the on-flash index. It contains some
additional runtime attributes per node which are not persisted. One of these is
a dirty-flag which marks nodes that have to be persisted the next time the
index is written onto the flash. The TNC acts as a write-back cache and all
modifications of the on-flash index are done through the TNC. Like other caches,
the TNC does not have to mirror the full index into memory, but reads parts of
it from flash whenever needed. A *commit* is the UBIFS operation of updating the
on-flash filesystem structures like the index. On every commit, the TNC nodes
marked as dirty are written to the flash to update the persisted index.

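The interplay of wandering tree, dirty flags and commit can be illustrated
with a small Python sketch. It is not the kernel implementation: the names
``TncNode``, ``mark_dirty`` and ``commit`` are hypothetical, the flash is
modelled as an append-only list, and the returned position stands in for the
pointer kept in the master node.

```python
class TncNode:
    """In-memory index node; `dirty` and `flash_pos` are runtime-only attributes."""

    def __init__(self, key, payload=""):
        self.key = key
        self.payload = payload
        self.children = []
        self.parent = None
        self.dirty = True        # new nodes have never been written
        self.flash_pos = None    # position of the latest on-flash version

    def add_child(self, child):
        child.parent = self
        self.children.append(child)
        mark_dirty(self)


def mark_dirty(node):
    """Mark a node and all of its ancestors up to the root as dirty."""
    while node is not None:
        node.dirty = True
        node = node.parent


def commit(node, flash):
    """Append every dirty node (children first) and return the root position."""
    for child in node.children:
        if child.dirty:
            commit(child, flash)
    if node.dirty:
        branches = tuple(c.flash_pos for c in node.children)
        flash.append((node.key, node.payload, branches))
        node.flash_pos = len(flash) - 1
        node.dirty = False
    return node.flash_pos


flash = []                       # append-only stand-in for the flash
root, a, b = TncNode("root"), TncNode("a", "v1"), TncNode("b", "v1")
root.add_child(a)
root.add_child(b)
master = commit(root, flash)     # first commit writes a, b and root

a.payload = "v2"
mark_dirty(a)
master = commit(root, flash)     # second commit rewrites only a and root
```

After the second commit the unchanged node ``b`` has not been rewritten, and
the old versions of ``a`` and ``root`` are still present in ``flash`` as
obsolete entries, which is the property the wandering tree relies on.
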

Journal
~~~~~~~

To avoid wearing out the flash, the index is only persisted (*committed*) when
certain conditions are met (e.g. ``fsync(2)``). The journal is used to record
any changes (in the form of inode nodes, data nodes etc.) between commits
of the index. During mount, the journal is read from the flash and replayed
onto the TNC (which will be created on-demand from the on-flash index).

UBIFS reserves a number of LEBs just for the journal, called the *log area*.
The number of log area LEBs is configured on filesystem creation (using
``mkfs.ubifs``) and stored in the superblock node. The log area contains only
two types of nodes: *reference nodes* and *commit start nodes*. A commit start
node is written whenever an index commit is performed. Reference nodes are
written on every journal update. Each reference node points to the position of
other nodes (inode nodes, data nodes etc.) on the flash that are part of this
journal entry. These nodes are called *buds* and describe the actual filesystem
changes including their data.

The log area is maintained as a ring. Whenever the journal is almost full,
a commit is initiated. This also writes a commit start node so that during
mount, UBIFS will look for the most recent commit start node and just replay
every reference node after that. Every reference node before the commit start
node will be ignored as they are already part of the on-flash index.

When writing a journal entry, UBIFS first ensures that enough space is
available to write the reference node and the buds that are part of this entry.
Then, the reference node is written and afterwards the buds describing the file
changes. On replay, UBIFS will record every reference node and inspect the
location of the referenced LEBs to discover the buds. If these are corrupt or
missing, UBIFS will attempt to recover them by re-reading the LEB. This is
however only done for the last referenced LEB of the journal. Only this can
become corrupt because of a power cut. If the recovery fails, UBIFS will not
mount. An error for every other LEB will directly cause UBIFS to fail the mount
operation.

::

       | ----    LOG AREA     ---- | ----------    MAIN AREA    ------------ |

        -----+------+-----+--------+----   ------+-----+-----+---------------
        \    |      |     |        |   /  /      |     |     |               \
        / CS |  REF | REF |        |   \  \ DENT | INO | INO |               /
        \    |      |     |        |   /  /      |     |     |               \
         ----+------+-----+--------+---   -------+-----+-----+----------------
                 |     |                  ^            ^
                 |     |                  |            |
                 +------------------------+            |
                       |                               |
                       +-------------------------------+


                Figure 2: UBIFS flash layout of log area with commit start nodes
                          (CS) and reference nodes (REF) pointing to main area
                          containing their buds

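The replay rules described in this section can be sketched in Python as
follows. This is a simplified illustration, not the kernel code: the log is
modelled as a linear list instead of a ring, and ``refs_after_last_commit``,
``replay``, ``read_leb`` and ``recover_leb`` are hypothetical names.

```python
def refs_after_last_commit(log):
    """Return the LEBs referenced after the most recent commit start node."""
    cs_positions = [i for i, (kind, _) in enumerate(log) if kind == "CS"]
    if not cs_positions:
        raise ValueError("no commit start node in log")
    return [leb for kind, leb in log[cs_positions[-1] + 1:] if kind == "REF"]


def replay(log, read_leb, recover_leb):
    """Collect buds; only the last referenced LEB may be recovered."""
    lebs = refs_after_last_commit(log)
    buds = []
    for n, leb in enumerate(lebs):
        try:
            buds.extend(read_leb(leb))
        except OSError:
            if n != len(lebs) - 1:
                raise                      # any other corrupt LEB fails the mount
            buds.extend(recover_leb(leb))  # a power cut can only hit the last one
    return buds


log = [("REF", 10), ("CS", None), ("REF", 11), ("REF", 12)]
main_area = {11: ["ino 1", "data 1"], 12: None}   # None marks a corrupt LEB


def read_leb(leb):
    if main_area[leb] is None:
        raise OSError("corrupt LEB %d" % leb)
    return main_area[leb]


def recover_leb(leb):
    return ["dent 2"]    # pretend one bud could be salvaged


buds = replay(log, read_leb, recover_leb)
```

The reference to LEB 10 precedes the commit start node and is therefore
ignored, while the corrupt LEB 12 is recovered because it is the last one
referenced by the journal.
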

LEB Property Tree/Table
~~~~~~~~~~~~~~~~~~~~~~~

The LEB property tree is used to store per-LEB information. This includes the
LEB type and amount of free and *dirty* (old, obsolete content) space [1]_ on
the LEB. The type is important, because UBIFS never mixes index nodes with data
nodes on a single LEB and thus each LEB has a specific purpose. This again is
useful for free space calculations. See [UBIFS-WP] for more details.

The LEB property tree again is a B+ tree, but it is much smaller than the
index. Due to its smaller size it is always written as one chunk on every
commit. Thus, saving the LPT is an atomic operation.

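The difference between free and dirty space can be made concrete with a tiny
Python model. ``LebProps`` and its fields are hypothetical names chosen for
this sketch and do not correspond to the on-flash LPT format.

```python
class LebProps:
    """Per-LEB accounting, loosely modelled on what the LPT records."""

    def __init__(self, size):
        self.size = size
        self.written = 0   # bytes appended since the last erase
        self.dirty = 0     # bytes of written content that are now obsolete

    @property
    def free(self):
        return self.size - self.written

    def append(self, length):
        if length > self.free:
            raise ValueError("not enough free space on this LEB")
        self.written += length

    def obsolete(self, length):
        self.dirty += length

    def erase(self):
        self.written = 0
        self.dirty = 0


leb = LebProps(size=1000)
leb.append(400)       # write a node
leb.append(400)       # write a newer version of the same node
leb.obsolete(400)     # the old version becomes dirty, not free
```

In this model the LEB ends up with 200 bytes free and 400 bytes dirty; the
dirty bytes only become free again after the whole LEB is erased.
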

.. [1] Since LEBs can only be appended and never overwritten, there is a
   difference between free space, i.e. the remaining space left on the LEB to
   be written to without erasing it, and previously written content that is
   obsolete but can't be overwritten without erasing the full LEB.


UBIFS Authentication
====================

This chapter introduces UBIFS authentication which enables UBIFS to verify
the authenticity and integrity of metadata and file contents stored on flash.


Threat Model
------------

UBIFS authentication enables detection of offline data modification. While it
does not prevent it, it enables (trusted) code to check the integrity and
authenticity of on-flash file contents and filesystem metadata. This covers
attacks where file contents are swapped.

UBIFS authentication will not protect against rollback of full flash contents.
I.e. an attacker can still dump the flash and restore it at a later time without
detection. It will also not protect against partial rollback of individual
index commits. That means that an attacker is able to partially undo changes.
This is possible because UBIFS does not immediately overwrite obsolete
versions of the index tree or the journal, but instead marks them as obsolete
and garbage collection erases them at a later time. An attacker can use this by
erasing parts of the current tree and restoring old versions that are still on
the flash and have not yet been erased. This is possible, because every commit
will always write a new version of the index root node and the master node
without overwriting the previous version. This is further helped by the
wear-leveling operations of UBI which copy contents from one physical
eraseblock to another and do not atomically erase the first eraseblock.

UBIFS authentication does not cover attacks where an attacker is able to
execute code on the device after the authentication key was provided.
Additional measures like secure boot and trusted boot have to be taken to
ensure that only trusted code is executed on a device.


Authentication
--------------

To be able to fully trust data read from flash, all UBIFS data structures
stored on flash are authenticated. That is:

- The index which includes file contents, file metadata like extended
  attributes, file length etc.
- The journal which also contains file contents and metadata by recording changes
  to the filesystem
- The LPT which stores UBI LEB metadata which UBIFS uses for free space accounting


Index Authentication
~~~~~~~~~~~~~~~~~~~~

With its concept of a wandering tree, UBIFS already takes care of only
updating and persisting changed parts from leaf node up to the root node
of the full B+ tree. This enables us to augment the index nodes of the tree
with a hash over each node's child nodes. As a result, the index is basically
also a Merkle tree. Since the leaf nodes of the index contain the actual
filesystem data, the hashes of their parent index nodes thus cover all the file
contents and file metadata. When a file changes, the UBIFS index is updated
accordingly from the leaf nodes up to the root node including the master node.
This process can be hooked to recompute the hash only for each changed node at
the same time. Whenever a file is read, UBIFS can verify the hashes from each
leaf node up to the root node to ensure the node's integrity.

To ensure the authenticity of the whole index, the UBIFS master node stores a
keyed hash (HMAC) over its own contents and a hash of the root node of the index
tree. As mentioned above, the master node is always written to the flash whenever
the index is persisted (i.e. on index commit).

Using this approach only UBIFS index nodes and the master node are changed to
include a hash. All other types of nodes will remain unchanged. This reduces
the storage overhead which is precious for users of UBIFS (i.e. embedded
devices).

::

                             +---------------+
                             |  Master Node  |
                             |    (hash)     |
                             +---------------+
                                     |
                                     v
                            +-------------------+
                            |  Index Node #1    |
                            |                   |
                            | branch0   branchn |
                            | (hash)    (hash)  |
                            +-------------------+
                               |    ...   |  (fanout: 8)
                               |          |
                       +-------+          +------+
                       |                         |
                       v                         v
            +-------------------+       +-------------------+
            |  Index Node #2    |       |  Index Node #3    |
            |                   |       |                   |
            | branch0   branchn |       | branch0   branchn |
            | (hash)    (hash)  |       | (hash)    (hash)  |
            +-------------------+       +-------------------+
                 |   ...                     |   ...   |
                 v                           v         v
               +-----------+         +----------+  +-----------+
               | Data Node |         | INO Node |  | DENT Node |
               +-----------+         +----------+  +-----------+


           Figure 3: Coverage areas of index node hash and master node HMAC



The most important part for robustness and power-cut safety is to atomically
persist the hash and file contents. Here the existing UBIFS logic for how
changed nodes are persisted is already designed for this purpose such that
UBIFS can safely recover if a power-cut occurs while persisting. Adding
hashes to index nodes does not change this since each hash will be persisted
atomically together with its respective node.

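The following Python sketch illustrates the idea of hashing the index as a
Merkle tree and authenticating its root through an HMAC in the master node.
It makes several assumptions that are not taken from this document: SHA-256 as
the hash, nested lists as the tree, a placeholder byte string for the other
master node fields, and the hypothetical names ``node_hash``, ``write_master``
and ``verify_index``.

```python
import hashlib
import hmac


def node_hash(node):
    """Hash a leaf (bytes) directly, an index node (list) over its branch hashes."""
    if isinstance(node, bytes):
        return hashlib.sha256(node).digest()
    return hashlib.sha256(b"".join(node_hash(child) for child in node)).digest()


def write_master(key, index_root, fields=b"master-node-fields"):
    """Return the master node body and the HMAC that authenticates it."""
    body = fields + node_hash(index_root)
    return body, hmac.new(key, body, hashlib.sha256).digest()


def verify_index(key, index_root, body, tag):
    """Check the master node HMAC, then the root hash stored in the master node."""
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return False
    return hmac.compare_digest(body[-32:], node_hash(index_root))


key = b"example authentication key"
index = [[b"data node"], [b"ino node", b"dent node"]]
body, tag = write_master(key, index)

# An offline modification of a single leaf changes the root hash.
tampered = [[b"data node"], [b"ino node", b"evil dent node"]]
```

For brevity the sketch rehashes the whole tree on verification. The design
described above only needs to check the hashes along the path from the leaf
being read up to the root.
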

Journal Authentication
~~~~~~~~~~~~~~~~~~~~~~

The journal is authenticated too. Since the journal is continuously written,
it is necessary to also add authentication information frequently to the
journal so that in case of a power cut only a small amount of data cannot be
authenticated. This is done by creating a continuous hash beginning from the
commit start node over the previous reference nodes, the current reference
node, and the bud nodes. From time to time, whenever it is suitable,
authentication nodes are added between the bud nodes. This new node type
contains an HMAC over the current state of the hash chain. That way a journal
can be authenticated up to the last authentication node. The tail of the
journal which may not have an authentication node cannot be authenticated and
is skipped during journal replay.

We get this picture for journal authentication::

    ,,,,,,,,
    ,......,...........................................
    ,. CS  ,               hash1.----.           hash2.----.
    ,.  |  ,                    .    |hmac            .    |hmac
    ,.  v  ,                    .    v                .    v
    ,.REF#0,-> bud -> bud -> bud.-> auth -> bud -> bud.-> auth ...
    ,..|...,...........................................
    ,  |   ,
    ,  |   ,,,,,,,,,,,,,,,
    .  |            hash3,----.
    ,  |                 ,    |hmac
    ,  v                 ,    v
    , REF#1 -> bud -> bud,-> auth ...
    ,,,|,,,,,,,,,,,,,,,,,,
       v
      REF#2 -> ...
       |
       V
      ...

Since the hash also includes the reference nodes, an attacker cannot reorder or
skip any journal heads for replay. An attacker can only remove bud nodes or
reference nodes from the end of the journal, effectively rewinding the
filesystem at most back to the last commit.

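A minimal Python sketch of such a hash chain with interleaved authentication
nodes is shown below. It flattens the journal into a single sequence of
reference and bud nodes and assumes SHA-256; ``write_journal``,
``replay_journal`` and the ``auth_every`` parameter are hypothetical and only
stand in for "whenever it is suitable".

```python
import hashlib
import hmac


def write_journal(key, entries, auth_every):
    """Emit nodes in write order, adding an auth node every `auth_every` nodes."""
    chain = hashlib.sha256(b"commit start node")
    journal = []
    for n, data in enumerate(entries, 1):
        chain.update(data)
        journal.append(("node", data))
        if n % auth_every == 0:
            tag = hmac.new(key, chain.digest(), hashlib.sha256).digest()
            journal.append(("auth", tag))
    return journal


def replay_journal(key, journal):
    """Return the nodes covered by a valid auth node; the tail is skipped."""
    chain = hashlib.sha256(b"commit start node")
    trusted, pending = [], []
    for kind, value in journal:
        if kind == "node":
            chain.update(value)
            pending.append(value)
        else:
            expected = hmac.new(key, chain.digest(), hashlib.sha256).digest()
            if not hmac.compare_digest(expected, value):
                raise ValueError("journal authentication failed")
            trusted.extend(pending)
            pending = []
    return trusted


key = b"example authentication key"
entries = [b"REF#0", b"bud A", b"bud B", b"REF#1", b"bud C"]
journal = write_journal(key, entries, auth_every=3)
replayed = replay_journal(key, journal)

# Swapping two nodes changes the chained hash seen by the next auth node.
reordered = [journal[1], journal[0]] + journal[2:]
```

Here only the first three nodes are replayed, because the last two are not
followed by an authentication node. Replaying ``reordered`` raises
``ValueError``, whereas cutting nodes off the end of ``journal`` merely
shortens what is replayed.
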
The location of the log area is stored in the master node. Since the master
node is authenticated with an HMAC as described above, it is not possible to
tamper with that without detection. The size of the log area is specified when
the filesystem is created using ``mkfs.ubifs`` and stored in the superblock
node. To avoid tampering with this and other values stored there, an HMAC is
added to the superblock struct. The superblock node is stored in LEB 0 and is
only modified on feature flag or similar changes, but never on file changes.


LPT Authentication
~~~~~~~~~~~~~~~~~~

The location of the LPT root node on the flash is stored in the UBIFS master
node. Since the LPT is written and read atomically on every commit, there is
no need to authenticate individual nodes of the tree. It suffices to
protect the integrity of the full LPT by a simple hash stored in the master
node. Since the master node itself is authenticated, the LPT's authenticity can
be verified by verifying the authenticity of the master node and comparing the
LPT hash stored there with the hash computed from the read on-flash LPT.

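The two-step LPT check can be sketched in Python as follows. The master node
is modelled as a JSON document purely for illustration, SHA-256 is an assumed
hash, and ``make_master`` and ``load_lpt`` are hypothetical names.

```python
import hashlib
import hmac
import json


def make_master(key, lpt_bytes):
    """Build a toy master node holding the LPT hash, plus its HMAC."""
    fields = {"lpt_hash": hashlib.sha256(lpt_bytes).hexdigest()}
    body = json.dumps(fields, sort_keys=True).encode()
    return body, hmac.new(key, body, hashlib.sha256).digest()


def load_lpt(key, body, tag, lpt_bytes):
    """Verify the master node first, then the LPT read from flash."""
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("master node HMAC mismatch")
    stored = json.loads(body)["lpt_hash"]
    if not hmac.compare_digest(stored, hashlib.sha256(lpt_bytes).hexdigest()):
        raise ValueError("LPT hash mismatch")
    return lpt_bytes


key = b"example authentication key"
lpt = b"leb 7: free=200 dirty=400"
body, tag = make_master(key, lpt)
checked = load_lpt(key, body, tag, lpt)
```

Because the LPT is hashed as one chunk, a change to any single LEB entry is
detected without per-node hashes inside the LPT.
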

Key Management
--------------

For simplicity, UBIFS authentication uses a single key to compute the HMACs
of superblock, master, commit start and reference nodes. This key has to be
available on creation of the filesystem (``mkfs.ubifs``) to authenticate the
superblock node. Further, it has to be available on mount of the filesystem
to verify authenticated nodes and generate new HMACs for changes.

UBIFS authentication is intended to operate side-by-side with UBIFS encryption
(fscrypt) to provide confidentiality and authenticity. Since UBIFS encryption
has a different approach of encryption policies per directory, there can be
multiple fscrypt master keys and there might be folders without encryption.
UBIFS authentication on the other hand has an all-or-nothing approach in the
sense that it either authenticates everything of the filesystem or nothing.
Because of this and because UBIFS authentication should also be usable without
encryption, it does not share the same master key with fscrypt, but manages
a dedicated authentication key.

The API for providing the authentication key has yet to be defined, but the
key can e.g. be provided by userspace through a keyring similar to the way it
is currently done in fscrypt. It should however be noted that the current
fscrypt approach has shown its flaws and the userspace API will eventually
change [FSCRYPT-POLICY2].

Nevertheless, it will be possible for a user to provide a single passphrase
or key in userspace that covers UBIFS authentication and encryption. This can
be solved by the corresponding userspace tools which derive a second key for
41309f4c750SMauro Carvalho Chehabauthentication in addition to the derived fscrypt master key used for
41409f4c750SMauro Carvalho Chehabencryption.
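
How a userspace tool might derive both keys from one passphrase is not
specified here. The following Python sketch shows one possible approach under
stated assumptions: PBKDF2 for stretching, HMAC-SHA256 with distinct labels
for separating the two keys, and an arbitrary iteration count. The function
name and the label strings are hypothetical.

```python
import hashlib
import hmac
from typing import Tuple


def derive_keys(passphrase: bytes, salt: bytes) -> Tuple[bytes, bytes]:
    """Derive separate encryption and authentication keys from one secret."""
    # Stretch the passphrase once; the iteration count is only an example.
    root = hashlib.pbkdf2_hmac("sha256", passphrase, salt, 100_000)
    # Separate the two keys with distinct, hypothetical labels so that
    # knowing one derived key does not reveal the other.
    enc_key = hmac.new(root, b"example: fscrypt master key",
                       hashlib.sha256).digest()
    auth_key = hmac.new(root, b"example: ubifs authentication key",
                        hashlib.sha256).digest()
    return enc_key, auth_key
```

With such a scheme the user handles a single secret, while the filesystem
still receives a dedicated authentication key that is independent of the
fscrypt master key.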

To be able to check if the proper key is available on mount, the UBIFS
superblock node will additionally store a hash of the authentication key. This
approach is similar to the approach proposed for fscrypt encryption policy v2
[FSCRYPT-POLICY2].
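
The mount-time key check can be pictured with a short Python sketch. The hash
algorithm (SHA-256 here) and the function names are assumptions for
illustration. The text above only states that a hash of the key is stored.

```python
import hashlib
import hmac


def key_fingerprint(auth_key: bytes) -> bytes:
    """Hash of the authentication key, as it could be stored at mkfs time."""
    return hashlib.sha256(auth_key).digest()


def check_key_on_mount(stored_fingerprint: bytes, provided_key: bytes) -> bool:
    """Return True if the provided key matches the stored fingerprint."""
    return hmac.compare_digest(stored_fingerprint,
                               key_fingerprint(provided_key))
```

Such a check only lets a mount fail early when the wrong key is supplied. It
does not replace verification of the node HMACs.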


Future Extensions
=================

In certain cases where a vendor wants to provide an authenticated filesystem
image to customers, it should be possible to do so without sharing the secret
UBIFS authentication key. Instead, in addition to each HMAC a digital
signature could be stored where the vendor shares the public key alongside the
filesystem image. In case this filesystem has to be modified afterwards,
UBIFS can exchange all digital signatures with HMACs on first mount similar
to the way the IMA/EVM subsystem deals with such situations. The HMAC key
will then have to be provided beforehand in the normal way.


References
==========

[CRYPTSETUP2]        https://www.saout.de/pipermail/dm-crypt/2017-November/005745.html

[DMC-CBC-ATTACK]     https://www.jakoblell.com/blog/2013/12/22/practical-malleability-attack-against-cbc-encrypted-luks-partitions/

[DM-INTEGRITY]       https://www.kernel.org/doc/Documentation/device-mapper/dm-integrity.rst

[DM-VERITY]          https://www.kernel.org/doc/Documentation/device-mapper/verity.rst

[FSCRYPT-POLICY2]    https://www.spinics.net/lists/linux-ext4/msg58710.html

[UBIFS-WP]           http://www.linux-mtd.infradead.org/doc/ubifs_whitepaper.pdf