.. SPDX-License-Identifier: GPL-2.0

======================================
Enhanced Read-Only File System - EROFS
======================================

Overview
========

EROFS stands for Enhanced Read-Only File System. Unlike other read-only
file systems, it is designed for flexibility and scalability while
remaining simple and high-performance.

It is designed as a better filesystem solution for the following scenarios:

 - read-only storage media, or

 - part of a fully trusted read-only solution, which needs to be
   immutable and bit-for-bit identical to the official golden image of a
   release for security and other reasons, or

 - saving storage space with guaranteed end-to-end performance by using
   reduced metadata and transparent file compression, especially on
   embedded devices with limited memory (e.g. smartphones).

Here are the main features of EROFS:

 - Little endian on-disk design;

 - Currently 4KB block size (nobh) and therefore maximum 16TB address space;

 - Metadata and data can be mixed by design;

 - 2 inode versions for different requirements:

   =====================  ============  =====================================
                          compact (v1)  extended (v2)
   =====================  ============  =====================================
   Inode metadata size    32 bytes      64 bytes
   Max file size          4 GB          16 EB (also limited by max. vol size)
   Max uids/gids          65536         4294967296
   File change time       no            yes (64 + 32-bit timestamp)
   Max hardlinks          65536         4294967296
   Metadata reserved      4 bytes       14 bytes
   =====================  ============  =====================================

 - Support extended attributes (xattrs) as an option;

 - Support xattr inline and tail-end data inline for all files;

 - Support POSIX.1e ACLs by using xattrs;

 - Support transparent file compression as an option:
   LZ4 algorithm with 4 KB fixed-sized output compression for high performance.
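
The size limits in the feature list above follow from on-disk field widths.
The arithmetic can be sketched as below. This is only an illustration: the
32-bit block address width and the 16-bit and 32-bit counter widths are
assumptions inferred from the stated limits, and the helper names are
hypothetical.

```python
# Arithmetic sketch for the limits listed above.
BLOCK_SIZE = 4 * 1024   # 4KB block size, as stated in the feature list
BLKADDR_BITS = 32       # ASSUMPTION: block addresses are 32 bits wide


def max_address_space(block_size=BLOCK_SIZE, blkaddr_bits=BLKADDR_BITS):
    """Bytes addressable when every block address selects one block."""
    return block_size * (1 << blkaddr_bits)


def counter_limit(bits):
    """Number of distinct values an unsigned field of `bits` bits can hold."""
    return 1 << bits


TIB = 1 << 40
print(max_address_space() // TIB, "TB address space")
# ASSUMPTION: compact inodes use 16-bit uid/gid/nlink fields and
# extended inodes use 32-bit ones, matching the table values.
print(counter_limit(16), "values per 16-bit field (compact)")
print(counter_limit(32), "values per 32-bit field (extended)")
```

Under these assumptions, 4KB blocks with 32-bit block addresses give the
16TB figure, and the 65536 and 4294967296 entries in the table are the
value counts of 16-bit and 32-bit fields.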

The following git tree provides the file system user-space tools under
development (e.g. the formatting tool mkfs.erofs):

- git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs-utils.git

Bugs and patches are welcome. Please send them to the linux-erofs
mailing list:

- linux-erofs mailing list   <linux-erofs@lists.ozlabs.org>

Mount options
=============

===================    =========================================================
(no)user_xattr         Set up Extended User Attributes. Note: xattr is enabled
                       by default if CONFIG_EROFS_FS_XATTR is selected.
(no)acl                Set up POSIX Access Control Lists. Note: acl is enabled
                       by default if CONFIG_EROFS_FS_POSIX_ACL is selected.
cache_strategy=%s      Select a strategy for cached decompression from now on:

                       ==========  =============================================
                       disabled    In-place I/O decompression only;
                       readahead   Cache the last incomplete compressed physical
                                   cluster for further reading. It still does
                                   in-place I/O decompression for the remaining
                                   compressed physical clusters;
                       readaround  Cache both ends of incomplete compressed
                                   physical clusters for further reading.
                                   It still does in-place I/O decompression
                                   for the remaining compressed physical
                                   clusters.
                       ==========  =============================================
===================    =========================================================

On-disk details
===============

Summary
-------
Unlike other read-only file systems, an EROFS volume is designed
to be as simple as possible::

                                |-> aligned with the block size
   ____________________________________________________________
  | |SB| | ... | Metadata | ... | Data | Metadata | ... | Data |
  |_|__|_|_____|__________|_____|______|__________|_____|______|
  0 +1K

All data areas should be aligned with the block size, but metadata areas
need not be. All metadata can currently be observed in two different
spaces (views):

 1. Inode metadata space

    Each valid inode should be aligned with an inode slot, which has a fixed
    size (32 bytes) and is designed to match the compact inode size.

    Each inode can be directly found with the following formula:
         inode offset = meta_blkaddr * block_size + 32 * nid

    ::

                                 |-> aligned with 8B
                                            |-> followed closely
     + meta_blkaddr blocks                                      |-> another slot
       _____________________________________________________________________
     |  ...   | inode |  xattrs  | extents  | data inline | ... | inode ...
     |________|_______|(optional)|(optional)|__(optional)_|_____|__________
              |-> aligned with the inode slot size
                   .                   .
                 .                         .
               .                              .
             .                                    .
           .                                         .
         .                                              .
       .____________________________________________________|-> aligned with 4B
       | xattr_ibody_header | shared xattrs | inline xattrs |
       |____________________|_______________|_______________|
       |->    12 bytes    <-|->x * 4 bytes<-|               .
                           .                .                 .
                     .                      .                   .
                .                           .                     .
            ._______________________________.______________________.
            | id | id | id | id |  ... | id | ent | ... | ent| ... |
            |____|____|____|____|______|____|_____|_____|____|_____|
                                            |-> aligned with 4B
                                                        |-> aligned with 4B

    An inode can be 32 or 64 bytes; the two sizes are distinguished by a
    common field present in all inode versions, i_format::

        __________________               __________________
       |     i_format     |             |     i_format     |
       |__________________|             |__________________|
       |        ...       |             |        ...       |
       |                  |             |                  |
       |__________________| 32 bytes    |                  |
                                        |                  |
                                        |__________________| 64 bytes

    Xattrs, extents and inline data follow the corresponding inode with
    proper alignment, and they are optional depending on the data mapping.
    Currently, 4 valid data mappings are supported:

    ==  ====================================================================
     0  flat file data without data inline (no extent);
     1  fixed-sized output data compression (with non-compacted indexes);
     2  flat file data with tail packing data inline (no extent);
     3  fixed-sized output data compression (with compacted indexes, v5.3+).
    ==  ====================================================================

    The size of the optional xattrs is indicated by i_xattr_count in the
    inode header. Large xattrs or xattrs shared by many different files can
    be stored in the shared xattrs metadata space rather than inlined right
    after the inode.

 2. Shared xattrs metadata space

    The shared xattrs space is similar to the inode space above: it starts
    at a specific block indicated by xattr_blkaddr, and its entries are laid
    out one after another with proper alignment.

    Each shared xattr can also be directly found with the following formula:
         xattr offset = xattr_blkaddr * block_size + 4 * xattr_id

    ::

                           |-> aligned by  4 bytes
       + xattr_blkaddr blocks                     |-> aligned with 4 bytes
        _________________________________________________________________________
       |  ...   | xattr_entry |  xattr data | ... |  xattr_entry | xattr data  ...
       |________|_____________|_____________|_____|______________|_______________

Directories
-----------
All directories are now organized in a compact on-disk format. Note that
each directory block is divided into index and name areas in order to support
random file lookup, and all directory entries are _strictly_ recorded in
alphabetical order in order to support an improved prefix binary search
algorithm (refer to the related source code).

::

                  ___________________________
                 /                           |
                /              ______________|________________
               /              /              | nameoff1       | nameoffN-1
  ____________.______________._______________v________________v__________
 | dirent | dirent | ... | dirent | filename | filename | ... | filename |
 |___.0___|____1___|_____|___N-1__|____0_____|____1_____|_____|___N-1____|
      \                           ^
       \                          |                           * could have
        \                         |                             trailing '\0'
         \________________________| nameoff0

                             Directory block

Note that apart from the offset of the first filename, nameoff0 also indicates
the total number of directory entries in this block, so there is no need to
introduce another on-disk field.
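
The directory block layout described above can be sketched in a few lines.
This is a minimal illustration, not the kernel implementation. It assumes a
12-byte little-endian dirent made of a 64-bit nid, a 16-bit nameoff, an 8-bit
file type and one reserved byte; that layout is not stated in this document,
so check it against erofs_fs.h. The helper names are hypothetical, and the
lookup uses a plain binary search rather than the improved prefix variant.

```python
import bisect
import struct

# ASSUMED dirent layout (verify against erofs_fs.h):
# nid (u64), nameoff (u16), file_type (u8), reserved (u8) = 12 bytes.
DIRENT = struct.Struct("<QHBB")


def build_dir_block(entries, block_size=4096):
    """Pack (name, nid, file_type) tuples into one directory block."""
    entries = sorted(entries)                 # names in byte-wise order
    index, names = bytearray(), bytearray()
    first_nameoff = DIRENT.size * len(entries)
    for name, nid, ftype in entries:
        index += DIRENT.pack(nid, first_nameoff + len(names), ftype, 0)
        names += name
    return bytes(index + names).ljust(block_size, b"\0")


def parse_dir_block(block):
    """Return [(name, nid)], deriving the entry count from nameoff0."""
    nameoff0 = DIRENT.unpack_from(block, 0)[1]
    count = nameoff0 // DIRENT.size           # no separate count field
    dirents = [DIRENT.unpack_from(block, i * DIRENT.size)
               for i in range(count)]
    result = []
    for i, (nid, nameoff, _ftype, _reserved) in enumerate(dirents):
        if i + 1 < count:
            name = block[nameoff:dirents[i + 1][1]]
        else:
            # The last name runs to the block end and may be NUL-padded.
            name = block[nameoff:].split(b"\0", 1)[0]
        result.append((name, nid))
    return result


def lookup(block, target):
    """Binary-search the sorted names; return the nid or None."""
    entries = parse_dir_block(block)
    names = [name for name, _ in entries]
    pos = bisect.bisect_left(names, target)
    if pos < len(names) and names[pos] == target:
        return entries[pos][1]
    return None


block = build_dir_block([(b"usr", 30, 2), (b"bin", 10, 2), (b"etc", 20, 2)])
print(parse_dir_block(block))
print(lookup(block, b"etc"))
```

Because the index area is immediately followed by the first filename,
dividing nameoff0 by the dirent size gives the number of entries, and each
name's length is the distance to the next nameoff.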

Compression
-----------
Currently, EROFS supports 4KB fixed-sized output transparent file compression,
as illustrated below::

             |---- Variant-Length Extent ----|-------- VLE --------|----- VLE -----
             clusterofs                      clusterofs            clusterofs
             |                               |                     |   logical data
    _________v_______________________________v_____________________v_______________
    ... |    .        |             |        .    |             |  .          | ...
    ____|____.________|_____________|________.____|_____________|__.__________|____
        |-> cluster <-|-> cluster <-|-> cluster <-|-> cluster <-|-> cluster <-|
             size          size          size          size          size
              .                             .                .                   .
               .                       .               .                  .
                .                  .              .                .
          _______._____________._____________._____________._____________________
             ... |             |             |             | ... physical data
          _______|_____________|_____________|_____________|_____________________
                 |-> cluster <-|-> cluster <-|-> cluster <-|
                      size          size          size

Currently each on-disk physical cluster can contain at most 4KB of
(un)compressed data. For each logical cluster, there is a corresponding
on-disk index to describe its cluster type, physical cluster address, etc.

See "struct z_erofs_vle_decompressed_index" in erofs_fs.h for more details.
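
Returning to the two lookup formulas given in the on-disk details, the
sketch below evaluates them for sample values. The function names and the
sample block addresses are hypothetical. The slot-count helper is an
inference from the statement that inodes are aligned to 32-byte slots, not
something this document spells out.

```python
BLOCK_SIZE = 4096        # 4KB block size, as stated in the overview
INODE_SLOT_SIZE = 32     # inode slot size from the inode metadata space
XATTR_ID_UNIT = 4        # multiplier in the shared xattr formula


def inode_offset(meta_blkaddr, nid, block_size=BLOCK_SIZE):
    """inode offset = meta_blkaddr * block_size + 32 * nid"""
    return meta_blkaddr * block_size + INODE_SLOT_SIZE * nid


def shared_xattr_offset(xattr_blkaddr, xattr_id, block_size=BLOCK_SIZE):
    """xattr offset = xattr_blkaddr * block_size + 4 * xattr_id"""
    return xattr_blkaddr * block_size + XATTR_ID_UNIT * xattr_id


def slots_needed(total_bytes):
    """INFERENCE: slots covered by an inode plus its trailing metadata."""
    return -(-total_bytes // INODE_SLOT_SIZE)   # ceiling division


# Sample values chosen only for illustration.
print(inode_offset(meta_blkaddr=1, nid=36))
print(shared_xattr_offset(xattr_blkaddr=2, xattr_id=3))
print(slots_needed(64))
```

Since a nid is simply a slot index, no inode table is needed for lookup. A
64-byte extended inode spans two slots, so the nid that follows it is at
least two higher.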