.. SPDX-License-Identifier: GPL-2.0

The Contents of inode.i\_block
------------------------------

Depending on the type of file an inode describes, the 60 bytes of
storage in ``inode.i_block`` can be used in different ways. In general,
regular files and directories will use it for file block indexing
information, and special files will use it for special purposes.

Symbolic Links
~~~~~~~~~~~~~~

The target of a symbolic link will be stored in this field if the target
string is less than 60 bytes long. Otherwise, either extents or block
maps will be used to allocate data blocks to store the link target.

Direct/Indirect Block Addressing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In ext2/3, file block numbers were mapped to logical block numbers by
means of an (up to) three-level 1-1 block map. To find the logical block
that stores a particular file block, the code would navigate through
this increasingly complicated structure. Notice that there is neither a
magic number nor a checksum to provide any level of confidence that the
block isn't full of garbage.

.. ifconfig:: builder != 'latex'

   .. include:: blockmap.rst

.. ifconfig:: builder == 'latex'

   [Table omitted because LaTeX doesn't support nested tables.]

Note that with this block mapping scheme, it is necessary to fill out a
lot of mapping data even for a large contiguous file! This inefficiency
led to the creation of the extent mapping scheme, discussed below.

Notice also that a file using this mapping scheme cannot be placed
higher than 2^32 blocks.

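The cost of the block map lookup can be illustrated with a small
classifier that reports how many indirect blocks must be read to reach a
given file block. This is a sketch under stated assumptions, not kernel
code: it assumes the classic ext2/3 arrangement of 12 direct slots
followed by one single-, one double-, and one triple-indirect slot in
``i_block``, and 4-byte block numbers. The name ``blockmap_depth`` is
hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of indirect blocks that must be read to
 * map file block `fblock`, or -1 if it lies beyond the triple-indirect
 * range. Assumes 12 direct slots and blocksize / 4 entries per
 * indirect block.
 */
int blockmap_depth(uint64_t fblock, uint32_t blocksize)
{
    const uint64_t ndirect = 12;
    const uint64_t per_block = blocksize / 4; /* entries per indirect block */
    uint64_t span = per_block;                /* file blocks covered at this level */
    int level;

    if (fblock < ndirect)
        return 0;                             /* mapped straight from i_block */
    fblock -= ndirect;

    for (level = 1; level <= 3; level++) {
        if (fblock < span)
            return level;
        fblock -= span;
        span *= per_block;
    }
    return -1;
}
```

With 4 KiB blocks, ``blockmap_depth(1000, 4096)`` evaluates to 1: file
block 1,000 is past the 12 direct slots but still inside the range of
the single-indirect block.
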
Extent Tree
~~~~~~~~~~~

In ext4, the file to logical block map has been replaced with an extent
tree. Under the old scheme, allocating a contiguous run of 1,000 blocks
requires an indirect block to map all 1,000 entries; with extents, the
mapping is reduced to a single ``struct ext4_extent`` with
``ee_len = 1000``. If flex\_bg is enabled, it is possible to allocate
very large files with a single extent, at a considerable reduction in
metadata block use, and some improvement in disk efficiency. The inode
must have the extents flag (0x80000) set for this feature to be in
use.

Extents are arranged as a tree. Each node of the tree begins with a
``struct ext4_extent_header``. If the node is an interior node
(``eh.eh_depth > 0``), the header is followed by ``eh.eh_entries``
instances of ``struct ext4_extent_idx``; each of these index entries
points to a block containing more nodes in the extent tree. If the node
is a leaf node (``eh.eh_depth == 0``), then the header is followed by
``eh.eh_entries`` instances of ``struct ext4_extent``; these instances
point to the file's data blocks. The root node of the extent tree is
stored in ``inode.i_block``, which allows for the first four extents to
be recorded without the use of extra metadata blocks.

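Because the header, index entries, and extent entries are all 12 bytes
long, and both entry types begin with a 32-bit file block number, one
search routine can serve interior and leaf nodes alike. The sketch below
is illustrative only: the names ``rd16``, ``rd32`` and ``ext_node_find``
are hypothetical, the entries are assumed to be sorted by starting file
block, and a plain linear scan is used for clarity where a production
implementation would more likely use a binary search.

```c
#include <stddef.h>
#include <stdint.h>

#define EXT_MAGIC      0xF30A /* eh_magic value */
#define EXT_ENTRY_SIZE 12     /* header, index and extent entries are 12 bytes each */

/* Read little-endian on-disk fields from a raw buffer. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: return the index of the last entry whose first
 * file block is <= fblock, or -1 if the node is malformed or no entry
 * starts at or before fblock.
 */
int ext_node_find(const uint8_t *node, size_t len, uint32_t fblock)
{
    uint16_t entries, i;
    int found = -1;

    if (len < EXT_ENTRY_SIZE || rd16(node) != EXT_MAGIC)
        return -1;
    entries = rd16(node + 2);                 /* eh_entries */
    if ((size_t)EXT_ENTRY_SIZE * (entries + 1u) > len)
        return -1;                            /* entries would overrun the buffer */

    for (i = 0; i < entries; i++) {
        const uint8_t *e = node + EXT_ENTRY_SIZE * (i + 1);

        if (rd32(e) > fblock)                 /* ei_block or ee_block */
            break;
        found = i;
    }
    return found;
}
```

For the root node, ``len`` would be the 60 bytes of ``inode.i_block``:
12 bytes of header plus room for four 12-byte entries.
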
The extent tree header is recorded in ``struct ext4_extent_header``,
which is 12 bytes long:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Size
     - Name
     - Description
   * - 0x0
     - \_\_le16
     - eh\_magic
     - Magic number, 0xF30A.
   * - 0x2
     - \_\_le16
     - eh\_entries
     - Number of valid entries following the header.
   * - 0x4
     - \_\_le16
     - eh\_max
     - Maximum number of entries that could follow the header.
   * - 0x6
     - \_\_le16
     - eh\_depth
     - Depth of this extent node in the extent tree. 0 = this extent node
       points to data blocks; otherwise, this extent node points to other
       extent nodes. The extent tree can be at most 5 levels deep: a logical
       block number can be at most ``2^32``, and the smallest ``n`` that
       satisfies ``4*(((blocksize - 12)/12)^n) >= 2^32`` is 5.
   * - 0x8
     - \_\_le32
     - eh\_generation
     - Generation of the tree. (Used by Lustre, but not standard ext4).

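The depth bound given for ``eh_depth`` can be checked numerically. The
sketch below evaluates the inequality
``4*(((blocksize - 12)/12)^n) >= 2^32`` for a given block size. It
assumes, as the formula does, that the root in the inode holds 4 entries
and that every tree block holds ``(blocksize - 12) / 12`` entries
(integer division). The name ``ext_min_levels`` is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: smallest n such that
 * 4 * ((blocksize - 12) / 12)^n >= 2^32, or -1 if the block is too
 * small to hold at least two entries.
 */
int ext_min_levels(uint32_t blocksize)
{
    const uint64_t target = 1ULL << 32;
    uint64_t fanout;
    uint64_t covered = 4;   /* entries available in the inode root */
    int n = 0;

    if (blocksize < 36)
        return -1;          /* fewer than two 12-byte entries after the header */
    fanout = (blocksize - 12) / 12;

    while (covered < target) {
        covered *= fanout;  /* each extra level multiplies the reach */
        n++;
    }
    return n;
}
```

Under these assumptions the function yields 5 for 1 KiB blocks (84
entries per block) and 4 for 4 KiB blocks (340 entries per block), so
the stated bound of 5 corresponds to the smallest block size.
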
Internal nodes of the extent tree, also known as index nodes, are
recorded as ``struct ext4_extent_idx``, and are 12 bytes long:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Size
     - Name
     - Description
   * - 0x0
     - \_\_le32
     - ei\_block
     - This index node covers file blocks from 'block' onward.
   * - 0x4
     - \_\_le32
     - ei\_leaf\_lo
     - Lower 32-bits of the block number of the extent node that is the next
       level lower in the tree. The tree node pointed to can be either another
       internal node or a leaf node, described below.
   * - 0x8
     - \_\_le16
     - ei\_leaf\_hi
     - Upper 16-bits of the previous field.
   * - 0xA
     - \_\_u16
     - ei\_unused
     -

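The split between ``ei_leaf_lo`` and ``ei_leaf_hi`` means the child
block number is 48 bits wide and has to be reassembled before use. A
minimal sketch follows; the struct ``ext_idx_host`` and the function
``ext_idx_child`` are hypothetical names, and the fields are assumed to
have already been converted from little-endian to host byte order.

```c
#include <stdint.h>

/* Host-order copy of the index entry fields listed in the table above. */
struct ext_idx_host {
    uint32_t ei_block;
    uint32_t ei_leaf_lo;
    uint16_t ei_leaf_hi;
    uint16_t ei_unused;
};

/* Hypothetical helper: 48-bit block number of the next-level node. */
uint64_t ext_idx_child(const struct ext_idx_host *ix)
{
    /* Widen before shifting so the upper 16 bits are not lost. */
    return ((uint64_t)ix->ei_leaf_hi << 32) | ix->ei_leaf_lo;
}
```
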
Leaf nodes of the extent tree are recorded as ``struct ext4_extent``,
and are also 12 bytes long:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Size
     - Name
     - Description
   * - 0x0
     - \_\_le32
     - ee\_block
     - First file block number that this extent covers.
   * - 0x4
     - \_\_le16
     - ee\_len
     - Number of blocks covered by extent. If the value of this field is <=
       32768, the extent is initialized. If the value of the field is > 32768,
       the extent is uninitialized and the actual extent length is ``ee_len`` -
       32768. Therefore, the maximum length of an initialized extent is 32768
       blocks, and the maximum length of an uninitialized extent is 32767.
   * - 0x6
     - \_\_le16
     - ee\_start\_hi
     - Upper 16-bits of the block number to which this extent points.
   * - 0x8
     - \_\_le32
     - ee\_start\_lo
     - Lower 32-bits of the block number to which this extent points.

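The ``ee_len`` encoding and the split start block can be decoded as in
the following sketch. It is an illustration of the rules in the table
above rather than kernel code: the struct ``ext_extent_host`` and the
three helper names are hypothetical, and the fields are assumed to be in
host byte order already.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXT_INIT_MAX_LEN 32768u /* largest ee_len that still means "initialized" */

/* Host-order copy of the extent fields listed in the table above. */
struct ext_extent_host {
    uint32_t ee_block;
    uint16_t ee_len;
    uint16_t ee_start_hi;
    uint32_t ee_start_lo;
};

/* True if the extent is uninitialized (ee_len > 32768). */
bool ext_is_uninit(const struct ext_extent_host *ex)
{
    return ex->ee_len > EXT_INIT_MAX_LEN;
}

/* Actual number of blocks covered, with the uninitialized bias removed. */
uint16_t ext_len(const struct ext_extent_host *ex)
{
    if (ext_is_uninit(ex))
        return (uint16_t)(ex->ee_len - EXT_INIT_MAX_LEN);
    return ex->ee_len;
}

/* 48-bit number of the first block the extent points to. */
uint64_t ext_start(const struct ext_extent_host *ex)
{
    return ((uint64_t)ex->ee_start_hi << 32) | ex->ee_start_lo;
}
```

An ``ee_len`` of 32768 therefore decodes as an initialized extent of
32768 blocks, while 32769 decodes as an uninitialized extent of 1 block
and 65535 as an uninitialized extent of 32767 blocks.
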
Prior to the introduction of metadata checksums, the extent header +
extent entries always left at least 4 bytes of unallocated space at the
end of each extent tree data block (because (2^x % 12) >= 4). Therefore,
the 32-bit checksum is inserted into this space. The 4 extents in the
inode do not need checksumming, since the inode is already checksummed.
The checksum is calculated against the FS UUID, the inode number, the
inode generation, and the entire extent block leading up to (but not
including) the checksum itself.

``struct ext4_extent_tail`` is 4 bytes long:

.. list-table::
   :widths: 8 8 24 40
   :header-rows: 1

   * - Offset
     - Size
     - Name
     - Description
   * - 0x0
     - \_\_le32
     - eb\_checksum
     - Checksum of the extent block, crc32c(uuid+inum+igeneration+extentblock)

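The claim that at least 4 bytes remain free at the end of an extent
block follows from the 12-byte entry size, and it also fixes where the
tail sits. The sketch below computes that position. It assumes the tail
is placed directly after the last possible entry slot, and that a full
tree block has ``eh_max`` equal to ``(blocksize - 12) / 12`` as in the
depth formula above. Both function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: entry slots that fit in a tree block after the header. */
uint16_t ext_block_max_entries(uint32_t blocksize)
{
    return (uint16_t)((blocksize - 12) / 12);
}

/*
 * Hypothetical helper: byte offset of the 4-byte tail, assuming it
 * follows the 12-byte header and eh_max 12-byte entry slots.
 */
uint32_t ext_tail_offset(uint16_t eh_max)
{
    return 12u + 12u * eh_max;
}
```

For a 4 KiB block this gives 340 entry slots and a tail offset of 4092,
leaving exactly 4 bytes; for a 2 KiB block it gives 169 slots and an
offset of 2040, leaving 8 bytes.
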
Inline Data
~~~~~~~~~~~

If the inline data feature is enabled for the filesystem and the flag is
set for the inode, it is possible that the first 60 bytes of the file
data are stored here.