:orphan:

.. UBIFS Authentication
.. sigma star gmbh
.. 2018

Introduction
============

UBIFS utilizes the fscrypt framework to provide confidentiality for file
contents and file names. This prevents attacks where an attacker is able to
read contents of the filesystem at a single point in time. A classic example
is a lost smartphone where the attacker is unable to read personal data stored
on the device without the filesystem decryption key.

In its current state, however, UBIFS encryption does not prevent attacks where
the attacker is able to modify the filesystem contents and the user uses the
device afterwards. In such a scenario an attacker can modify filesystem
contents arbitrarily without the user noticing. One example is to modify a
binary to perform a malicious action when executed [DMC-CBC-ATTACK]. Since
most of the filesystem metadata of UBIFS is stored in plaintext, it is
fairly easy to swap files and replace their contents.

Other full disk encryption systems like dm-crypt cover all filesystem metadata,
which makes such kinds of attacks more complicated, but not impossible,
especially if the attacker is given access to the device at multiple points in
time. For dm-crypt and other filesystems that build upon the Linux block IO
layer, the dm-integrity or dm-verity subsystems [DM-INTEGRITY, DM-VERITY]
can be used to get full data authentication at the block layer.
These can also be combined with dm-crypt [CRYPTSETUP2].

This document describes an approach to get file contents *and* full metadata
authentication for UBIFS. Since UBIFS uses fscrypt for file contents and file
name encryption, the authentication system could be tied into fscrypt such that
existing features like key derivation can be utilized. It should however also
be possible to use UBIFS authentication without using encryption.


MTD, UBI & UBIFS
----------------

On Linux, the MTD (Memory Technology Devices) subsystem provides a uniform
interface to access raw flash devices. One of the more prominent subsystems that
work on top of MTD is UBI (Unsorted Block Images). It provides volume management
for flash devices and is thus somewhat similar to LVM for block devices. In
addition, it deals with flash-specific wear-leveling and transparent I/O error
handling. UBI offers logical erase blocks (LEBs) to the layers on top of it
and maps them transparently to physical erase blocks (PEBs) on the flash.

UBIFS is a filesystem for raw flash which operates on top of UBI. Thus, wear
leveling and some flash specifics are left to UBI, while UBIFS focuses on
scalability, performance and recoverability.

::

	+------------+ +*******+ +-----------+ +-----+
	|            | * UBIFS * | UBI-BLOCK | | ... |
	| JFFS/JFFS2 | +*******+ +-----------+ +-----+
	|            | +-----------------------------+ +-----------+ +-----+
	|            | |              UBI            | | MTD-BLOCK | | ... |
	+------------+ +-----------------------------+ +-----------+ +-----+
	+------------------------------------------------------------------+
	|                  MEMORY TECHNOLOGY DEVICES (MTD)                 |
	+------------------------------------------------------------------+
	+-----------------------------+ +--------------------------+ +-----+
	|         NAND DRIVERS        | |        NOR DRIVERS       | | ... |
	+-----------------------------+ +--------------------------+ +-----+

            Figure 1: Linux kernel subsystems for dealing with raw flash



Internally, UBIFS maintains multiple data structures which are persisted on
the flash:

- *Index*: an on-flash B+ tree where the leaf nodes contain filesystem data
- *Journal*: an additional data structure to collect FS changes before updating
  the on-flash index and reduce flash wear.
- *Tree Node Cache (TNC)*: an in-memory B+ tree that reflects the current FS
  state to avoid frequent flash reads. It is basically the in-memory
  representation of the index, but contains additional attributes.
- *LEB property tree (LPT)*: an on-flash B+ tree for free space accounting per
  UBI LEB.

In the remainder of this section we will cover the on-flash UBIFS data
structures in more detail. The TNC is of less importance here since it is never
persisted onto the flash directly. More details on UBIFS can also be found in
[UBIFS-WP].


UBIFS Index & Tree Node Cache
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Basic on-flash UBIFS entities are called *nodes*. UBIFS knows different types
of nodes, e.g. data nodes (``struct ubifs_data_node``) which store chunks of
file contents or inode nodes (``struct ubifs_ino_node``) which represent VFS
inodes. Almost all types of nodes share a common header (``ubifs_ch``)
containing basic information like node type, node length, a sequence number,
etc. (see ``fs/ubifs/ubifs-media.h`` in the kernel source). Exceptions are
entries of the LPT and some less important node types like padding nodes which
are used to pad unusable content at the end of LEBs.

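As a rough illustration of the common header, the following Python sketch
decodes the first bytes of a node. The field order, the 24-byte size and the
magic value are assumptions taken from a reading of ``fs/ubifs/ubifs-media.h``
and should be checked against the kernel source; ``parse_common_header`` is a
hypothetical helper, not a kernel or mtd-utils function.

```python
import struct

# Assumed layout of struct ubifs_ch: magic, crc, sqnum, len, node_type,
# group_type, 2 padding bytes; little-endian, 24 bytes in total.
UBIFS_CH = struct.Struct("<IIQIBB2x")
UBIFS_NODE_MAGIC = 0x06101831  # assumed value, see ubifs-media.h


def parse_common_header(buf):
    """Decode the common header at the start of a raw node buffer."""
    magic, crc, sqnum, length, node_type, group_type = UBIFS_CH.unpack_from(buf)
    if magic != UBIFS_NODE_MAGIC:
        raise ValueError("not a UBIFS node")
    return {"crc": crc, "sqnum": sqnum, "len": length,
            "node_type": node_type, "group_type": group_type}


# Build a synthetic header (CRC left at zero) and decode it again.
raw = UBIFS_CH.pack(UBIFS_NODE_MAGIC, 0, 42, UBIFS_CH.size, 1, 0)
header = parse_common_header(raw)
```

The sketch does not verify the CRC; it only shows which basic information
every node carries in front of its type-specific payload.
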
To avoid re-writing the whole B+ tree on every single change, it is implemented
as a *wandering tree*, where only the changed nodes are re-written and previous
versions of them are obsoleted without erasing them right away. As a result,
the index is not stored in a single place on the flash, but *wanders* around
and there are obsolete parts on the flash as long as the LEB containing them is
not reused by UBIFS. To find the most recent version of the index, UBIFS stores
a special node called the *master node* into UBI LEB 1 which always points to
the most recent root node of the UBIFS index. For recoverability, the master
node is additionally duplicated to LEB 2. Mounting UBIFS is thus a simple read
of LEB 1 and 2 to get the current master node and from there get the location
of the most recent on-flash index.

The TNC is the in-memory representation of the on-flash index. It contains some
additional runtime attributes per node which are not persisted. One of these is
a dirty-flag which marks nodes that have to be persisted the next time the
index is written onto the flash. The TNC acts as a write-back cache and all
modifications of the on-flash index are done through the TNC. Like other caches,
the TNC does not have to mirror the full index into memory, but reads parts of
it from flash whenever needed. A *commit* is the UBIFS operation of updating the
on-flash filesystem structures like the index. On every commit, the TNC nodes
marked as dirty are written to the flash to update the persisted index.

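The wandering tree can be pictured with a small Python model. This is only a
sketch under simplifying assumptions: the flash is an append-only list, index
nodes are plain lists of child addresses, and ``Flash`` and ``update_leaf`` are
hypothetical names that do not exist in UBIFS.

```python
class Flash:
    """Append-only store: nodes are never modified in place."""

    def __init__(self):
        self.nodes = []      # position in the list stands for the flash address
        self.master = None   # address of the current root index node

    def write(self, node):
        self.nodes.append(node)
        return len(self.nodes) - 1


def update_leaf(flash, path, new_leaf):
    """Rewrite one leaf and every index node on the path up to the root.

    `path` lists the child slot taken at each level, starting at the root.
    """
    # Collect the addresses of the index nodes along the path.
    addrs = [flash.master]
    for slot in path[:-1]:
        addrs.append(flash.nodes[addrs[-1]][slot])
    # Write the new leaf, then new copies of its ancestors, bottom-up.
    child = flash.write(new_leaf)
    for addr, slot in zip(reversed(addrs), reversed(path)):
        node = list(flash.nodes[addr])   # copy; the old version stays on flash
        node[slot] = child
        child = flash.write(node)
    flash.master = child                 # the master now points to the new root


flash = Flash()
leaves = [flash.write(name) for name in ("a", "b", "c", "d")]
left = flash.write([leaves[0], leaves[1]])
right = flash.write([leaves[2], leaves[3]])
flash.master = flash.write([left, right])
old_root = flash.master

update_leaf(flash, [1, 0], "C")          # replace leaf "c"
```

After the update, the old root and the old right-hand index node are still
present but obsolete, while the untouched left-hand subtree is shared between
the old and the new version of the tree.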

Journal
~~~~~~~

To avoid wearing out the flash, the index is only persisted (*committed*) when
certain conditions are met (e.g. ``fsync(2)``). The journal is used to record
any changes (in form of inode nodes, data nodes etc.) between commits
of the index. During mount, the journal is read from the flash and replayed
onto the TNC (which will be created on-demand from the on-flash index).

UBIFS reserves a number of LEBs just for the journal, called the *log area*.
The number of log area LEBs is configured on filesystem creation (using
``mkfs.ubifs``) and stored in the superblock node. The log area contains only
two types of nodes: *reference nodes* and *commit start nodes*. A commit start
node is written whenever an index commit is performed. Reference nodes are
written on every journal update. Each reference node points to the position of
other nodes (inode nodes, data nodes etc.) on the flash that are part of this
journal entry. These nodes are called *buds* and describe the actual filesystem
changes including their data.

The log area is maintained as a ring. Whenever the journal is almost full,
a commit is initiated. This also writes a commit start node so that during
mount, UBIFS will look for the most recent commit start node and just replay
every reference node after that. Every reference node before the commit start
node will be ignored as they are already part of the on-flash index.

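The selection of reference nodes for replay can be sketched as follows. The
log is modelled as a flat list in write order, which ignores the wrap-around
of the ring; ``nodes_to_replay`` and the tuple representation are illustrative
assumptions only.

```python
def nodes_to_replay(log):
    """Return the reference nodes that follow the most recent commit start.

    `log` is the log area in write order, as (node_type, payload) tuples.
    """
    last_cs = max((i for i, (kind, _) in enumerate(log) if kind == "CS"),
                  default=-1)
    return [payload for kind, payload in log[last_cs + 1:] if kind == "REF"]


log = [("CS", 1), ("REF", "leb 10"), ("REF", "leb 11"),
       ("CS", 2), ("REF", "leb 12"), ("REF", "leb 13")]
pending = nodes_to_replay(log)
```

Here only the two reference nodes written after the second commit start node
end up in ``pending``; the earlier ones are already reflected in the index.
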
When writing a journal entry, UBIFS first ensures that enough space is
available to write the reference node and the buds that are part of this entry.
Then, the reference node is written and afterwards the buds describing the file
changes. On replay, UBIFS will record every reference node and inspect the
location of the referenced LEBs to discover the buds. If these are corrupt or
missing, UBIFS will attempt to recover them by re-reading the LEB. This is
however only done for the last referenced LEB of the journal, since only this
one can become corrupt because of a power cut. If the recovery fails, UBIFS
will not mount. An error for every other LEB will directly cause UBIFS to fail
the mount operation.

::

       | ----    LOG AREA     ---- | ----------    MAIN AREA    ------------ |

        -----+------+-----+--------+----   ------+-----+-----+---------------
        \    |      |     |        |   /  /      |     |     |               \
        / CS |  REF | REF |        |   \  \ DENT | INO | INO |               /
        \    |      |     |        |   /  /      |     |     |               \
         ----+------+-----+--------+---   -------+-----+-----+----------------
                 |     |                  ^            ^
                 |     |                  |            |
                 +------------------------+            |
                       |                               |
                       +-------------------------------+


                Figure 2: UBIFS flash layout of log area with commit start nodes
                          (CS) and reference nodes (REF) pointing to main area
                          containing their buds


LEB Property Tree/Table
~~~~~~~~~~~~~~~~~~~~~~~

The LEB property tree is used to store per-LEB information. This includes the
LEB type and amount of free and *dirty* (old, obsolete content) space [1]_ on
the LEB. The type is important, because UBIFS never mixes index nodes with data
nodes on a single LEB and thus each LEB has a specific purpose. This again is
useful for free space calculations. See [UBIFS-WP] for more details.

The LEB property tree again is a B+ tree, but it is much smaller than the
index. Due to its smaller size it is always written as one chunk on every
commit. Thus, saving the LPT is an atomic operation.


.. [1] Since LEBs can only be appended and never overwritten, there is a
   difference between free space, i.e. the remaining space left on the LEB to
   be written to without erasing it, and previously written content that is
   obsolete but can't be overwritten without erasing the full LEB.

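The difference between free and dirty space can be made concrete with a toy
model of a single LEB. The ``Leb`` class and its method names are hypothetical
and only mirror the accounting described in the footnote above.

```python
class Leb:
    """Toy model of one logical erase block that can only be appended to."""

    def __init__(self, size):
        self.size = size
        self.used = 0    # bytes written since the last erase
        self.dirty = 0   # written bytes whose content is obsolete

    @property
    def free(self):
        return self.size - self.used

    def append(self, length):
        if length > self.free:
            raise ValueError("LEB full")
        self.used += length

    def obsolete(self, length):
        self.dirty += length   # not reusable until the whole LEB is erased

    def erase(self):
        self.used = self.dirty = 0


leb = Leb(size=128 * 1024)
leb.append(4096)
leb.append(4096)
leb.obsolete(4096)   # the first node was superseded by a newer version
```

Obsoleting a node increases the dirty count but leaves the free count
unchanged; only erasing the LEB turns dirty space back into free space.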

UBIFS Authentication
====================

This chapter introduces UBIFS authentication, which enables UBIFS to verify
the authenticity and integrity of metadata and file contents stored on flash.


Threat Model
------------

UBIFS authentication enables detection of offline data modification. While it
does not prevent it, it enables (trusted) code to check the integrity and
authenticity of on-flash file contents and filesystem metadata. This covers
attacks where file contents are swapped.

UBIFS authentication will not protect against rollback of full flash contents.
I.e. an attacker can still dump the flash and restore it at a later time without
detection. It will also not protect against partial rollback of individual
index commits. That means that an attacker is able to partially undo changes.
This is possible because UBIFS does not immediately overwrite obsolete
versions of the index tree or the journal, but instead marks them as obsolete
and garbage collection erases them at a later time. An attacker can use this by
erasing parts of the current tree and restoring old versions that are still on
the flash and have not yet been erased. This is possible, because every commit
will always write a new version of the index root node and the master node
without overwriting the previous version. This is further helped by the
wear-leveling operations of UBI which copy contents from one physical
eraseblock to another and do not atomically erase the first eraseblock.

UBIFS authentication does not cover attacks where an attacker is able to
execute code on the device after the authentication key was provided.
Additional measures like secure boot and trusted boot have to be taken to
ensure that only trusted code is executed on a device.


Authentication
--------------

To be able to fully trust data read from flash, all UBIFS data structures
stored on flash are authenticated. That is:

- The index which includes file contents, file metadata like extended
  attributes, file length etc.
- The journal which also contains file contents and metadata by recording changes
  to the filesystem
- The LPT which stores UBI LEB metadata which UBIFS uses for free space accounting


Index Authentication
~~~~~~~~~~~~~~~~~~~~

Through UBIFS' concept of a wandering tree, it already takes care of only
updating and persisting changed parts from leaf node up to the root node
of the full B+ tree. This enables us to augment the index nodes of the tree
with a hash over each node's child nodes. As a result, the index is basically
also a Merkle tree. Since the leaf nodes of the index contain the actual
filesystem data, the hashes of their parent index nodes thus cover all the file
contents and file metadata. When a file changes, the UBIFS index is updated
accordingly from the leaf nodes up to the root node including the master node.
This process can be hooked to recompute the hash only for each changed node at
the same time. Whenever a file is read, UBIFS can verify the hashes from each
leaf node up to the root node to ensure the node's integrity.

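The Merkle tree property can be sketched in a few lines of Python. The model
below is an assumption-laden illustration: SHA-256 stands in for whichever hash
is configured, leaves are raw byte strings, and ``IndexNode``, ``node_hash``
and ``verify_path`` are hypothetical names.

```python
import hashlib


def digest(data):
    return hashlib.sha256(data).digest()


def node_hash(node):
    """A leaf is hashed over its bytes, an index node over its branch hashes."""
    if isinstance(node, bytes):
        return digest(node)
    return digest(b"".join(node.branch_hashes))


class IndexNode:
    def __init__(self, children):
        self.children = children
        # One stored hash per branch, covering the child it points to.
        self.branch_hashes = [node_hash(c) for c in children]


def verify_path(root, root_hash, path):
    """Check every hash from the trusted root hash down to one leaf."""
    if node_hash(root) != root_hash:
        return False
    node = root
    for slot in path:
        child = node.children[slot]
        if node_hash(child) != node.branch_hashes[slot]:
            return False
        node = child
    return True


left = IndexNode([b"data node"])
right = IndexNode([b"inode node", b"dentry node"])
root = IndexNode([left, right])
trusted = node_hash(root)            # would be protected by the master node

ok_before = verify_path(root, trusted, [1, 0])
right.children[0] = b"evil inode"    # offline modification of a leaf
ok_after = verify_path(root, trusted, [1, 0])
```

The modified leaf no longer matches the branch hash stored in its parent, so
the second verification fails even though no index node was touched.
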
To ensure the authenticity of the whole index, the UBIFS master node stores a
keyed hash (HMAC) over its own contents and a hash of the root node of the index
tree. As mentioned above, the master node is always written to the flash whenever
the index is persisted (i.e. on index commit).

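A minimal sketch of this keyed hash, assuming HMAC-SHA256 and a master node
reduced to an opaque byte string plus the root hash. The helper names, the key
and the field contents are made up for illustration.

```python
import hashlib
import hmac


def seal_master(key, master_fields, root_hash):
    """Compute the HMAC over the master node contents and the root hash."""
    return hmac.new(key, master_fields + root_hash, hashlib.sha256).digest()


def master_is_authentic(key, master_fields, root_hash, stored_hmac):
    expected = seal_master(key, master_fields, root_hash)
    return hmac.compare_digest(expected, stored_hmac)


key = b"example authentication key"   # hypothetical key
root_hash = hashlib.sha256(b"root index node").digest()
fields = b"root at LEB 12, offset 4096"
tag = seal_master(key, fields, root_hash)

genuine = master_is_authentic(key, fields, root_hash, tag)
forged = master_is_authentic(key, b"root at LEB 7, offset 0", root_hash, tag)
```

Without the key an attacker cannot produce a matching HMAC for a master node
that points to a different root node.
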
Using this approach only UBIFS index nodes and the master node are changed to
include a hash. All other types of nodes will remain unchanged. This reduces
the storage overhead which is precious for users of UBIFS (i.e. embedded
devices).

::

                             +---------------+
                             |  Master Node  |
                             |    (hash)     |
                             +---------------+
                                     |
                                     v
                            +-------------------+
                            |  Index Node #1    |
                            |                   |
                            | branch0   branchn |
                            | (hash)    (hash)  |
                            +-------------------+
                               |    ...   |  (fanout: 8)
                               |          |
                       +-------+          +------+
                       |                         |
                       v                         v
            +-------------------+       +-------------------+
            |  Index Node #2    |       |  Index Node #3    |
            |                   |       |                   |
            | branch0   branchn |       | branch0   branchn |
            | (hash)    (hash)  |       | (hash)    (hash)  |
            +-------------------+       +-------------------+
                 |   ...                     |   ...   |
                 v                           v         v
               +-----------+         +----------+  +-----------+
               | Data Node |         | INO Node |  | DENT Node |
               +-----------+         +----------+  +-----------+


           Figure 3: Coverage areas of index node hash and master node HMAC



The most important part for robustness and power-cut safety is to atomically
persist the hash and file contents. Here the existing UBIFS logic for how
changed nodes are persisted is already designed for this purpose such that
UBIFS can safely recover if a power-cut occurs while persisting. Adding
hashes to index nodes does not change this since each hash will be persisted
atomically together with its respective node.


Journal Authentication
~~~~~~~~~~~~~~~~~~~~~~

The journal is authenticated too. Since the journal is continuously written,
it is necessary to also add authentication information frequently to the
journal so that in case of a power cut only a small amount of data cannot be
authenticated. This is done by creating a continuous hash beginning from the
commit start node over the previous reference nodes, the current reference
node, and the bud nodes. From time to time, whenever it is suitable,
authentication nodes are added between the bud nodes. This new node type
contains an HMAC over the current state of the hash chain. That way a journal
can be authenticated up to the last authentication node. The tail of the
journal which may not have an authentication node cannot be authenticated and
is skipped during journal replay.

We get this picture for journal authentication::

    ,,,,,,,,
    ,......,...........................................
    ,. CS  ,               hash1.----.           hash2.----.
    ,.  |  ,                    .    |hmac            .    |hmac
    ,.  v  ,                    .    v                .    v
    ,.REF#0,-> bud -> bud -> bud.-> auth -> bud -> bud.-> auth ...
    ,..|...,...........................................
    ,  |   ,
    ,  |   ,,,,,,,,,,,,,,,
    .  |            hash3,----.
    ,  |                 ,    |hmac
    ,  v                 ,    v
    , REF#1 -> bud -> bud,-> auth ...
    ,,,|,,,,,,,,,,,,,,,,,,
       v
      REF#2 -> ...
       |
       V
      ...

Since the hash also includes the reference nodes, an attacker cannot reorder or
skip any journal heads for replay. An attacker can only remove bud nodes or
reference nodes from the end of the journal, effectively rewinding the
filesystem at maximum back to the last commit.

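The hash chain with interleaved authentication nodes can be modelled as shown
here. The sketch assumes SHA-256 for the running hash and HMAC-SHA256 for the
authentication nodes, flattens the journal into a single list, and uses the
hypothetical helpers ``write_journal`` and ``authenticated_prefix``.

```python
import hashlib
import hmac

KEY = b"example authentication key"   # hypothetical key


def write_journal(groups, key=KEY):
    """Emit the nodes of each group followed by one authentication node."""
    chain = hashlib.sha256()
    journal = []
    for group in groups:
        for node in group:
            chain.update(node)
            journal.append(("node", node))
        tag = hmac.new(key, chain.digest(), hashlib.sha256).digest()
        journal.append(("auth", tag))
    return journal


def authenticated_prefix(journal, key=KEY):
    """Return the nodes covered by the last valid authentication node."""
    chain = hashlib.sha256()
    trusted, pending = [], []
    for kind, payload in journal:
        if kind == "node":
            chain.update(payload)
            pending.append(payload)
            continue
        expected = hmac.new(key, chain.digest(), hashlib.sha256).digest()
        if not hmac.compare_digest(expected, payload):
            break
        trusted += pending
        pending = []
    return trusted


journal = write_journal([[b"CS", b"REF#0", b"bud1", b"bud2"], [b"bud3"]])
journal.append(("node", b"bud4"))     # tail written just before a power cut
replayable = authenticated_prefix(journal)

# Swapping two nodes changes the running hash, so nothing authenticates.
tampered = [journal[0], journal[2], journal[1]] + journal[3:]
after_tampering = authenticated_prefix(tampered)
```

The unauthenticated tail (``bud4``) is dropped on replay, and reordering nodes
invalidates every authentication node that follows the change.
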
The location of the log area is stored in the master node. Since the master
node is authenticated with an HMAC as described above, it is not possible to
tamper with that without detection. The size of the log area is specified when
the filesystem is created using ``mkfs.ubifs`` and stored in the superblock
node. To avoid tampering with this and other values stored there, an HMAC is
added to the superblock struct. The superblock node is stored in LEB 0 and is
only modified on feature flag or similar changes, but never on file changes.


LPT Authentication
~~~~~~~~~~~~~~~~~~

The location of the LPT root node on the flash is stored in the UBIFS master
node. Since the LPT is written and read atomically on every commit, there is
no need to authenticate individual nodes of the tree. It suffices to
protect the integrity of the full LPT by a simple hash stored in the master
node. Since the master node itself is authenticated, the LPT's authenticity can
be verified by verifying the authenticity of the master node and comparing the
LPT hash stored there with the hash computed from the read on-flash LPT.

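The LPT check then reduces to a single comparison, sketched here under the
assumption that SHA-256 is used and that the master node has already been
verified. ``lpt_is_authentic`` and the byte strings are illustrative only.

```python
import hashlib
import hmac


def lpt_is_authentic(lpt_bytes, master_lpt_hash):
    """Compare the hash of the LPT read from flash with the master's copy."""
    computed = hashlib.sha256(lpt_bytes).digest()
    return hmac.compare_digest(computed, master_lpt_hash)


lpt_on_flash = b"serialized LPT written at commit"
# At commit time this hash would be stored in the (authenticated) master node.
master_lpt_hash = hashlib.sha256(lpt_on_flash).digest()

intact = lpt_is_authentic(lpt_on_flash, master_lpt_hash)
modified = lpt_is_authentic(lpt_on_flash + b"!", master_lpt_hash)
```

No key is needed at this step because the stored hash inherits its
authenticity from the HMAC of the master node.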

Key Management
--------------

For simplicity, UBIFS authentication uses a single key to compute the HMACs
of superblock, master, commit start and reference nodes. This key has to be
available on creation of the filesystem (``mkfs.ubifs``) to authenticate the
superblock node. Further, it has to be available on mount of the filesystem
to verify authenticated nodes and generate new HMACs for changes.

390*09f4c750SMauro Carvalho ChehabUBIFS authentication is intended to operate side-by-side with UBIFS encryption
391*09f4c750SMauro Carvalho Chehab(fscrypt) to provide confidentiality and authenticity. Since UBIFS encryption
392*09f4c750SMauro Carvalho Chehabhas a different approach of encryption policies per directory, there can be
393*09f4c750SMauro Carvalho Chehabmultiple fscrypt master keys and there might be folders without encryption.
394*09f4c750SMauro Carvalho ChehabUBIFS authentication on the other hand has an all-or-nothing approach in the
395*09f4c750SMauro Carvalho Chehabsense that it either authenticates everything of the filesystem or nothing.
396*09f4c750SMauro Carvalho ChehabBecause of this and because UBIFS authentication should also be usable without
397*09f4c750SMauro Carvalho Chehabencryption, it does not share the same master key with fscrypt, but manages
398*09f4c750SMauro Carvalho Chehaba dedicated authentication key.

The API for providing the authentication key has yet to be defined, but the
key can e.g. be provided by userspace through a keyring similar to the way it
is currently done in fscrypt. It should however be noted that the current
fscrypt approach has shown its flaws and the userspace API will eventually
change [FSCRYPT-POLICY2].

Nevertheless, it will be possible for a user to provide a single passphrase
or key in userspace that covers UBIFS authentication and encryption. This can
be solved by the corresponding userspace tools which derive a second key for
authentication in addition to the derived fscrypt master key used for
encryption.
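One conceivable way for a userspace tool to obtain two independent keys from a
single passphrase is sketched below in Python. The text does not prescribe a
derivation scheme, so everything here is an assumption: PBKDF2-HMAC-SHA256 for
stretching the passphrase, HMAC with distinct labels for key separation, and
the label strings and the function name `derive_keys` themselves.

```python
import hashlib
import hmac


def derive_keys(passphrase, salt, iterations=100_000):
    """Derive (encryption_key, authentication_key) from one passphrase."""
    # Stretch the passphrase into a 32-byte intermediate secret.
    secret = hashlib.pbkdf2_hmac("sha256", passphrase, salt, iterations)
    # Separate the two purposes by keying HMAC with the secret and feeding
    # it a different (hypothetical) label for each derived key.
    enc_key = hmac.new(secret, b"example-fscrypt-master-key",
                       hashlib.sha256).digest()
    auth_key = hmac.new(secret, b"example-ubifs-auth-key",
                        hashlib.sha256).digest()
    return enc_key, auth_key
```

Because the labels differ, the two keys are unrelated from the point of view
of anyone who does not know the intermediate secret, while the user still only
has to remember one passphrase.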

To be able to check if the proper key is available on mount, the UBIFS
superblock node will additionally store a hash of the authentication key. This
approach is similar to the approach proposed for fscrypt encryption policy v2
[FSCRYPT-POLICY2].
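The mount-time key check can be pictured with the following minimal Python
sketch. It assumes SHA-256 as the hash function and uses the hypothetical
helper names `key_hash` and `is_proper_key`; the actual hash algorithm and
superblock field are not specified here.

```python
import hashlib
import hmac


def key_hash(auth_key):
    """Hash of the authentication key, as it would be stored at mkfs time."""
    return hashlib.sha256(auth_key).digest()


def is_proper_key(candidate_key, stored_hash):
    """Check on mount whether the supplied key matches the stored hash."""
    return hmac.compare_digest(key_hash(candidate_key), stored_hash)
```

A mismatch lets the mount fail early with a clear "wrong key" condition
instead of reporting every authenticated node as corrupted.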


Future Extensions
=================

In certain cases where a vendor wants to provide an authenticated filesystem
image to customers, it should be possible to do so without sharing the secret
UBIFS authentication key. Instead, in addition to each HMAC a digital
signature could be stored, with the vendor sharing the public key alongside
the filesystem image. In case this filesystem has to be modified afterwards,
UBIFS can exchange all digital signatures with HMACs on first mount similar
to the way the IMA/EVM subsystem deals with such situations. The HMAC key
will then have to be provided beforehand in the normal way.


References
==========

[CRYPTSETUP2]        http://www.saout.de/pipermail/dm-crypt/2017-November/005745.html

[DMC-CBC-ATTACK]     http://www.jakoblell.com/blog/2013/12/22/practical-malleability-attack-against-cbc-encrypted-luks-partitions/

[DM-INTEGRITY]       https://www.kernel.org/doc/Documentation/device-mapper/dm-integrity.rst

[DM-VERITY]          https://www.kernel.org/doc/Documentation/device-mapper/verity.rst

[FSCRYPT-POLICY2]    https://www.spinics.net/lists/linux-ext4/msg58710.html

[UBIFS-WP]           http://www.linux-mtd.infradead.org/doc/ubifs_whitepaper.pdf