.. SPDX-License-Identifier: GPL-2.0

===========================
Ramfs, rootfs and initramfs
===========================

October 17, 2005

:Author: Rob Landley <rob@landley.net>

What is ramfs?
--------------

Ramfs is a very simple filesystem that exports Linux's disk caching
mechanisms (the page cache and dentry cache) as a dynamically resizable
RAM-based filesystem.

Normally all files are cached in memory by Linux. Pages of data read from
backing store (usually the block device the filesystem is mounted on) are kept
around in case they're needed again, but marked as clean (freeable) in case the
Virtual Memory system needs the memory for something else. Similarly, data
written to files is marked clean as soon as it has been written to backing
store, but kept around for caching purposes until the VM reallocates the
memory. A similar mechanism (the dentry cache) greatly speeds up access to
directories.

With ramfs, there is no backing store. Files written into ramfs allocate
dentries and page cache as usual, but there's nowhere to write them to.
This means the pages are never marked clean, so they can't be freed by the
VM when it's looking to recycle memory.

The amount of code required to implement ramfs is tiny, because all the
work is done by the existing Linux caching infrastructure. Basically,
you're mounting the disk cache as a filesystem. Because of this, ramfs is not
an optional component removable via menuconfig, since there would be negligible
space savings.

ramfs and ramdisk:
------------------

The older "ram disk" mechanism created a synthetic block device out of
an area of RAM and used it as backing store for a filesystem. This block
device was of fixed size, so the filesystem mounted on it was of fixed
size. Using a ram disk also required unnecessarily copying memory from the
fake block device into the page cache (and copying changes back out), as well
as creating and destroying dentries. Plus it needed a filesystem driver
(such as ext2) to format and interpret this data.

Compared to ramfs, this wastes memory (and memory bus bandwidth), creates
unnecessary work for the CPU, and pollutes the CPU caches.
(There are tricks
to avoid this copying by playing with the page tables, but they're unpleasantly
complicated and turn out to be about as expensive as the copying anyway.)
More to the point, all the work ramfs is doing has to happen _anyway_,
since all file access goes through the page and dentry caches. The RAM
disk is simply unnecessary; ramfs is internally much simpler.

Another reason ramdisks are semi-obsolete is that the introduction of
loopback devices offered a more flexible and convenient way to create
synthetic block devices, now from files instead of from chunks of memory.
See losetup(8) for details.

ramfs and tmpfs:
----------------

One downside of ramfs is you can keep writing data into it until you fill
up all memory, and the VM can't free it because the VM thinks that files
should get written to backing store (rather than swap space), but ramfs hasn't
got any backing store. Because of this, only root (or a trusted user) should
be allowed write access to a ramfs mount.

A ramfs derivative called tmpfs was created to add size limits, and the ability
to write the data to swap space. Normal users can be allowed write access to
tmpfs mounts. See Documentation/filesystems/tmpfs.rst for more information.

What is rootfs?
---------------

Rootfs is a special instance of ramfs (or tmpfs, if that's enabled), which is
always present in 2.6 systems. You can't unmount rootfs for approximately the
same reason you can't kill the init process; rather than having special code
to check for and handle an empty list, it's smaller and simpler for the kernel
to just make sure certain lists can't become empty.

Most systems just mount another filesystem over rootfs and ignore it. The
amount of space an empty instance of ramfs takes up is tiny.

If CONFIG_TMPFS is enabled, rootfs will use tmpfs instead of ramfs by
default. To force ramfs, add "rootfstype=ramfs" to the kernel command
line.

What is initramfs?
------------------

All 2.6 Linux kernels contain a gzipped "cpio" format archive, which is
extracted into rootfs when the kernel boots up. After extracting, the kernel
checks to see if rootfs contains a file "init", and if so it executes it as PID
1. If found, this init process is responsible for bringing the system the
rest of the way up, including locating and mounting the real root device (if
any).
If rootfs does not contain an init program after the embedded cpio
archive is extracted into it, the kernel will fall through to the older code
to locate and mount a root partition, then exec some variant of /sbin/init
out of that.

All this differs from the old initrd in several ways:

  - The old initrd was always a separate file, while the initramfs archive is
    linked into the Linux kernel image. (The directory ``linux-*/usr`` is
    devoted to generating this archive during the build.)

  - The old initrd file was a gzipped filesystem image (in some file format,
    such as ext2, that needed a driver built into the kernel), while the new
    initramfs archive is a gzipped cpio archive (like tar only simpler,
    see cpio(1) and Documentation/driver-api/early-userspace/buffer-format.rst).
    The kernel's cpio extraction code is not only extremely small, it's also
    __init text and data that can be discarded during the boot process.

  - The program run by the old initrd (which was called /initrd, not /init) did
    some setup and then returned to the kernel, while the init program from
    initramfs is not expected to return to the kernel. (If /init needs to hand
    off control it can overmount / with a new root device and exec another init
    program.
    See the switch_root utility, below.)

  - When switching to another root device, initrd would pivot_root and then
    umount the ramdisk. But initramfs is rootfs: you can neither pivot_root
    rootfs, nor unmount it. Instead delete everything out of rootfs to
    free up the space (find -xdev / -exec rm '{}' ';'), overmount rootfs
    with the new root (cd /newmount; mount --move . /; chroot .), attach
    stdin/stdout/stderr to the new /dev/console, and exec the new init.

    Since this is a remarkably persnickety process (and involves deleting
    commands before you can run them), the klibc package introduced a helper
    program (utils/run_init.c) to do all this for you. Most other packages
    (such as busybox) have named this command "switch_root".

Populating initramfs:
---------------------

The 2.6 kernel build process always creates a gzipped cpio format initramfs
archive and links it into the resulting kernel binary. By default, this
archive is empty (consuming 134 bytes on x86).

The config option CONFIG_INITRAMFS_SOURCE (in General Setup in menuconfig,
and living in usr/Kconfig) can be used to specify a source for the
initramfs archive, which will automatically be incorporated into the
resulting binary.
This option can point to an existing gzipped cpio
archive, a directory containing files to be archived, or a text file
specification such as the following example::

  dir /dev 755 0 0
  nod /dev/console 644 0 0 c 5 1
  nod /dev/loop0 644 0 0 b 7 0
  dir /bin 755 1000 1000
  slink /bin/sh busybox 777 0 0
  file /bin/busybox initramfs/busybox 755 0 0
  dir /proc 755 0 0
  dir /sys 755 0 0
  dir /mnt 755 0 0
  file /init initramfs/init.sh 755 0 0

Run "usr/gen_init_cpio" (after the kernel build) to get a usage message
documenting the above file format.

One advantage of the configuration file is that root access is not required to
set permissions or create device nodes in the new archive. (Note that those
two example "file" entries expect to find files named "init.sh" and "busybox" in
a directory called "initramfs", under the linux-2.6.* directory. See
Documentation/driver-api/early-userspace/early_userspace_support.rst for more details.)

The kernel does not depend on external cpio tools.
If you specify a
directory instead of a configuration file, the kernel's build infrastructure
creates a configuration file from that directory (usr/Makefile calls
usr/gen_initramfs.sh), and proceeds to package up that directory
using the config file (by feeding it to usr/gen_init_cpio, which is created
from usr/gen_init_cpio.c). The kernel's build-time cpio creation code is
entirely self-contained, and the kernel's boot-time extractor is also
(obviously) self-contained.

The one thing you might need external cpio utilities installed for is creating
or extracting your own preprepared cpio files to feed to the kernel build
(instead of a config file or directory).

The following command line can extract a cpio image (created either by the
script below or by the kernel build) back into its component files::

  cpio -i -d -H newc -F initramfs_data.cpio --no-absolute-filenames

The following shell script can create a prebuilt cpio archive you can
use in place of the above config file::

  #!/bin/sh

  # Copyright 2006 Rob Landley <rob@landley.net> and TimeSys Corporation.
  # Licensed under GPL version 2

  if [ $# -ne 2 ]
  then
    echo "usage: mkinitramfs directory imagename.cpio.gz"
    exit 1
  fi

  if [ -d "$1" ]
  then
    echo "creating $2 from $1"
    (cd "$1"; find . | cpio -o -H newc | gzip) > "$2"
  else
    echo "First argument must be a directory"
    exit 1
  fi

.. Note::

   The cpio man page contains some bad advice that will break your initramfs
   archive if you follow it. It says "A typical way to generate the list
   of filenames is with the find command; you should give find the -depth
   option to minimize problems with permissions on directories that are
   unwritable or not searchable." Don't do this when creating
   initramfs.cpio.gz images, it won't work. The Linux kernel cpio extractor
   won't create files in a directory that doesn't exist, so the directory
   entries must go before the files that go in those directories.
   The above script gets them in the right order.
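
The ordering difference described in the note can be checked on a throwaway
tree. The following is only an illustrative sketch: the directory and file
names are made up for the demonstration, and it assumes a find that accepts
-depth and a mktemp that accepts -d.

```shell
#!/bin/sh
# Build a tiny throwaway tree (hypothetical names) to compare the orderings.
tmp=$(mktemp -d)
mkdir -p "$tmp/root/bin"
: > "$tmp/root/bin/busybox"

# Default traversal: each directory is listed before its contents,
# which is the order the kernel's cpio extractor needs.
preorder=$(cd "$tmp/root" && find . | tr '\n' ' ')

# -depth traversal: contents are listed before the directory itself,
# so a file entry would precede the entry that creates its directory.
postorder=$(cd "$tmp/root" && find . -depth | tr '\n' ' ')

echo "find .        : $preorder"
echo "find . -depth : $postorder"

# Clean up the scratch directory.
rm -rf "$tmp"
```

In the first listing ./bin appears before ./bin/busybox; in the second the
file comes first, which is the ordering the note warns against.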

External initramfs images:
--------------------------

If the kernel has initrd support enabled, an external cpio.gz archive can also
be passed into a 2.6 kernel in place of an initrd. In this case, the kernel
will autodetect the type (initramfs, not initrd) and extract the external cpio
archive into rootfs before trying to run /init.

This has the memory efficiency advantages of initramfs (no ramdisk block
device) but the separate packaging of initrd (which is nice if you have
non-GPL code you'd like to run from initramfs, without conflating it with
the GPL licensed Linux kernel binary).

It can also be used to supplement the kernel's built-in initramfs image. The
files in the external archive will overwrite any conflicting files in
the built-in initramfs archive. Some distributors also prefer to customize
a single kernel image with task-specific initramfs images, without recompiling.

Contents of initramfs:
----------------------

An initramfs archive is a complete self-contained root filesystem for Linux.
If you don't already understand what shared libraries, devices, and paths
you need to get a minimal root filesystem up and running, here are some
references:

- https://www.tldp.org/HOWTO/Bootdisk-HOWTO/
- https://www.tldp.org/HOWTO/From-PowerUp-To-Bash-Prompt-HOWTO.html
- http://www.linuxfromscratch.org/lfs/view/stable/

The "klibc" package (https://www.kernel.org/pub/linux/libs/klibc) is
designed to be a tiny C library to statically link early userspace
code against, along with some related utilities. It is BSD licensed.

I use uClibc (https://www.uclibc.org) and busybox (https://www.busybox.net)
myself. These are LGPL and GPL, respectively. (A self-contained initramfs
package is planned for the busybox 1.3 release.)

In theory you could use glibc, but that's not well suited for small embedded
uses like this. (A "hello world" program statically linked against glibc is
over 400k. With uClibc it's 7k. Also note that glibc dlopens libnss to do
name lookups, even when otherwise statically linked.)

A good first step is to get initramfs to run a statically linked "hello world"
program as init, and test it under an emulator like qemu (www.qemu.org) or
User Mode Linux, like so::

  cat > hello.c << EOF
  #include <stdio.h>
  #include <unistd.h>

  int main(int argc, char *argv[])
  {
    printf("Hello world!\n");
    sleep(999999999);
  }
  EOF
  gcc -static hello.c -o init
  echo init | cpio -o -H newc | gzip > test.cpio.gz
  # Testing external initramfs using the initrd loading mechanism.
  qemu -kernel /boot/vmlinuz -initrd test.cpio.gz /dev/zero

When debugging a normal root filesystem, it's nice to be able to boot with
"init=/bin/sh". The initramfs equivalent is "rdinit=/bin/sh", and it's
just as useful.

Why cpio rather than tar?
-------------------------

This decision was made back in December, 2001.
The discussion started here:

  http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1538.html

And spawned a second thread (specifically on tar vs cpio), starting here:

  http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1587.html

The quick and dirty summary version (which is no substitute for reading
the above threads) is:

1) cpio is a standard. It's decades old (from the AT&T days), and already
   widely used on Linux (inside RPM, Red Hat's device driver disks). Here's
   a Linux Journal article about it from 1996:

      http://www.linuxjournal.com/article/1213

   It's not as popular as tar because the traditional cpio command line tools
   require _truly_hideous_ command line arguments. But that says nothing
   either way about the archive format, and there are alternative tools,
   such as:

      http://freecode.com/projects/afio

2) The cpio archive format chosen by the kernel is simpler and cleaner (and
   thus easier to create and parse) than any of the (literally dozens of)
   various tar archive formats.
   The complete initramfs archive format is
   explained in Documentation/driver-api/early-userspace/buffer-format.rst,
   created in usr/gen_init_cpio.c, and extracted in init/initramfs.c. All
   three together come to less than 26k total of human-readable text.

3) The GNU project standardizing on tar is approximately as relevant as
   Windows standardizing on zip. Linux is not part of either, and is free
   to make its own technical decisions.

4) Since this is a kernel internal format, it could easily have been
   something brand new. The kernel provides its own tools to create and
   extract this format anyway. Using an existing standard was preferable,
   but not essential.

5) Al Viro made the decision (quote: "tar is ugly as hell and not going to be
   supported on the kernel side"):

      http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1540.html

   explained his reasoning:

     - http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1550.html
     - http://www.uwsg.iu.edu/hypermail/linux/kernel/0112.2/1638.html

   and, most importantly, designed and implemented the initramfs code.
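
To give a feel for how small the format mentioned in point 2 is, here is a
rough sketch of the fixed-size ASCII header that precedes each entry in the
"newc" cpio format, following the field layout described in
Documentation/driver-api/early-userspace/buffer-format.rst. The function name
newc_header and its argument order are invented for this illustration, and the
values passed in are only an example; usr/gen_init_cpio.c remains the real
generator.

```shell
#!/bin/sh
# newc_header INO MODE UID GID NLINK MTIME FILESIZE NAMESIZE
# Hypothetical helper: prints the ASCII header of one newc entry.
# The header is the 6-character magic "070701" followed by 13 fields,
# each written as 8 hexadecimal digits.
newc_header() {
    # After FILESIZE come devmajor, devminor, rdevmajor and rdevminor
    # (all 0 in this sketch), then NAMESIZE, then the checksum field,
    # which is zero when the magic is "070701".
    printf '070701%08X%08X%08X%08X%08X%08X%08X%08X%08X%08X%08X%08X%08X' \
        "$1" "$2" "$3" "$4" "$5" "$6" "$7" 0 0 0 0 "$8" 0
}

# Example: header for the entry that terminates an archive. Its name is
# "TRAILER!!!", so NAMESIZE is 11 (10 characters plus the trailing NUL).
hdr=$(newc_header 0 0 0 0 1 0 0 11)
echo "$hdr"
echo "header length: ${#hdr}"
```

The header is always 110 characters long (6 for the magic plus 13 fields of 8
digits); the file name and then the file data follow it, each padded to a
multiple of four bytes.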

Future directions:
------------------

Today (2.6.16), initramfs is always compiled in, but not always used. The
kernel falls back to legacy boot code that is reached only if initramfs does
not contain an /init program. The fallback is legacy code, there to ensure a
smooth transition and to allow early boot functionality to gradually move to
"early userspace" (i.e. initramfs).

The move to early userspace is necessary because finding and mounting the real
root device is complex. Root partitions can span multiple devices (raid or
separate journal). They can be out on the network (requiring dhcp, setting a
specific MAC address, logging into a server, etc). They can live on removable
media, with dynamically allocated major/minor numbers and persistent naming
issues requiring a full udev implementation to sort out. They can be
compressed, encrypted, copy-on-write, loopback mounted, strangely partitioned,
and so on.

This kind of complexity (which inevitably includes policy) is rightly handled
in userspace. Both klibc and busybox/uClibc are working on simple initramfs
packages to drop into a kernel build.

The klibc package has now been accepted into Andrew Morton's 2.6.17-mm tree.
The kernel's current early boot code (partition detection, etc) will probably
be migrated into a default initramfs, automatically created and used by the
kernel build.