.. SPDX-License-Identifier: GPL-2.0

======
NILFS2
======

NILFS2 is a log-structured file system (LFS) supporting continuous
snapshotting.  In addition to versioning the entire file system, it
lets users restore files that were mistakenly overwritten or destroyed
just a few seconds ago.  Since NILFS2 maintains consistency like a
conventional LFS, it achieves quick recovery after system crashes.

NILFS2 creates checkpoints every few seconds or on a
per-synchronous-write basis (unless there is no change).  Users can
select significant versions among the continuously created checkpoints
and change them into snapshots, which are preserved until they are
changed back to checkpoints.

There is no limit on the number of snapshots until the volume gets
full.  Each snapshot is mountable as a read-only file system
concurrently with its writable mount, which is convenient for online
backup.

The userland tools are included in the nilfs-utils package, which is
available from the download page listed below.  At least "mkfs.nilfs2",
"mount.nilfs2", "umount.nilfs2", and "nilfs_cleanerd" (the so-called
cleaner or garbage collector) are required.  The tools are described
in detail in the man pages included in the package.

:Project web page:    https://nilfs.sourceforge.io/
:Download page:       https://nilfs.sourceforge.io/en/download.html
:List info:           http://vger.kernel.org/vger-lists.html#linux-nilfs

Caveats
=======

Features which NILFS2 does not support yet:

        - atime
        - extended attributes
        - POSIX ACLs
        - quotas
        - fsck
        - defragmentation

Mount options
=============

NILFS2 supports the following mount options:
(*) == default

======================= =======================================================
barrier(*)              This enables/disables the use of write barriers.  This
nobarrier               requires an IO stack which can support barriers, and
                        if nilfs gets an error on a barrier write, it will
                        disable barriers again with a warning.
errors=continue         Keep going on a filesystem error.
errors=remount-ro(*)    Remount the filesystem read-only on an error.
errors=panic            Panic and halt the machine if an error occurs.
cp=n                    Specify the checkpoint number of the snapshot to be
                        mounted.  Checkpoints and snapshots are listed by the
                        lscp user command.  Only checkpoints marked as
                        snapshots are mountable with this option.  A snapshot
                        is read-only, so a read-only mount option must be
                        specified together with it.
order=relaxed(*)        Apply relaxed order semantics that allow modified data
                        blocks to be written to disk without making a
                        checkpoint if no metadata update is in progress.  This
                        mode is equivalent to the ordered data mode of the
                        ext3 filesystem except that updates on data blocks
                        still preserve atomicity.  This will improve
                        synchronous write performance for overwriting.
order=strict            Apply strict in-order semantics that preserve the
                        sequence of all file operations including overwriting
                        of data blocks.  This guarantees that no overtaking
                        of events occurs in the recovered file system after a
                        crash.
norecovery              Disable recovery of the filesystem on mount.
                        This disables every write access on the device for
                        read-only mounts or snapshots.  This option will fail
                        for r/w mounts on an unclean volume.
discard                 This enables/disables the use of discard/TRIM commands.
nodiscard(*)            The discard/TRIM commands are sent to the underlying
                        block device when blocks are freed.  This is useful
                        for SSD devices and sparse/thinly-provisioned LUNs.
======================= =======================================================

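The rules in the table can be checked before calling mount.  The
following is a minimal sketch, not part of nilfs-utils or the kernel:
the helper name ``check_nilfs2_options`` and its ``read_only`` parameter
are hypothetical, and only the option names and values listed in the
table are treated as known.

```python
# Hypothetical helper, not part of nilfs-utils or the kernel.  Option names
# and values are taken from the mount option table.
FLAG_OPTIONS = {"barrier", "nobarrier", "norecovery", "discard", "nodiscard"}
VALUE_OPTIONS = {
    "errors": {"continue", "remount-ro", "panic"},
    "order": {"relaxed", "strict"},
}


def check_nilfs2_options(options, read_only=False):
    """Return a list of problems found in a list of mount option strings."""
    problems = []
    for opt in options:
        key, sep, value = opt.partition("=")
        if key == "cp" and sep:
            if not value.isdigit():
                problems.append("cp= needs a checkpoint number")
            if not read_only:
                # A snapshot is read-only, so cp= needs a read-only mount.
                problems.append("cp= requires a read-only mount")
        elif key in VALUE_OPTIONS and sep:
            if value not in VALUE_OPTIONS[key]:
                problems.append(f"unknown {key} mode: {value!r}")
        elif opt not in FLAG_OPTIONS:
            problems.append(f"unknown option: {opt!r}")
    return problems
```

An empty list means the sketch found nothing to complain about; it does
not guarantee that the mount itself will succeed (for example,
norecovery still fails for r/w mounts on an unclean volume).
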
Ioctls
======

Some NILFS2-specific functionality can be accessed by applications
through the system call interface.  All NILFS2-specific ioctls are
listed in the table below.

Table of NILFS2-specific ioctls:

 ============================== ===============================================
 Ioctl                          Description
 ============================== ===============================================
 NILFS_IOCTL_CHANGE_CPMODE      Change mode of given checkpoint between
                                checkpoint and snapshot state. This ioctl is
                                used in chcp and mkcp utilities.

 NILFS_IOCTL_DELETE_CHECKPOINT  Remove checkpoint from NILFS2 file system.
                                This ioctl is used in rmcp utility.

 NILFS_IOCTL_GET_CPINFO         Return info about requested checkpoints. This
                                ioctl is used in lscp utility and by
                                nilfs_cleanerd daemon.

 NILFS_IOCTL_GET_CPSTAT         Return checkpoint statistics. This ioctl is
                                used by lscp, rmcp utilities and by
                                nilfs_cleanerd daemon.

 NILFS_IOCTL_GET_SUINFO         Return segment usage info about requested
                                segments. This ioctl is used in lssu,
                                nilfs_resize utilities and by nilfs_cleanerd
                                daemon.

 NILFS_IOCTL_SET_SUINFO         Modify segment usage info of requested
                                segments. This ioctl is used by
                                nilfs_cleanerd daemon to skip unnecessary
                                cleaning operation of segments and reduce
                                performance penalty or wear of flash device
                                due to redundant move of in-use blocks.

 NILFS_IOCTL_GET_SUSTAT         Return segment usage statistics. This ioctl
                                is used in lssu, nilfs_resize utilities and
                                by nilfs_cleanerd daemon.

 NILFS_IOCTL_GET_VINFO          Return information on virtual block addresses.
                                This ioctl is used by nilfs_cleanerd daemon.

 NILFS_IOCTL_GET_BDESCS         Return information about descriptors of disk
                                block numbers. This ioctl is used by
                                nilfs_cleanerd daemon.

 NILFS_IOCTL_CLEAN_SEGMENTS     Do garbage collection operation in the
                                environment of requested parameters from
                                userspace. This ioctl is used by
                                nilfs_cleanerd daemon.

 NILFS_IOCTL_SYNC               Make a checkpoint. This ioctl is used in
                                mkcp utility.

 NILFS_IOCTL_RESIZE             Resize NILFS2 volume. This ioctl is used
                                by nilfs_resize utility.

 NILFS_IOCTL_SET_ALLOC_RANGE    Define lower limit of segments in bytes and
                                upper limit of segments in bytes. This ioctl
                                is used by nilfs_resize utility.
 ============================== ===============================================

NILFS2 usage
============

To use nilfs2 as a local file system, simply::

 # mkfs -t nilfs2 /dev/block_device
 # mount -t nilfs2 /dev/block_device /dir

This will also invoke the cleaner through the mount helper program
(mount.nilfs2).

Checkpoints and snapshots are managed by the following commands.
Their manpages are included in the nilfs-utils package above.

  ====     ===========================================================
  lscp     list checkpoints or snapshots.
  mkcp     make a checkpoint or a snapshot.
  chcp     change an existing checkpoint to a snapshot or vice versa.
  rmcp     invalidate specified checkpoint(s).
  ====     ===========================================================

To mount a snapshot::

 # mount -t nilfs2 -r -o cp=<cno> /dev/block_device /snap_dir

where <cno> is the checkpoint number of the snapshot.

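When the snapshot mount is driven from a script, the command line above
can be assembled as an argument list.  This is only a sketch: the
function name ``snapshot_mount_argv`` is hypothetical, and the device
and directory names are the same placeholders used in the command
above.

```python
def snapshot_mount_argv(device, cno, mount_point):
    """Build the argument list for a read-only mount of snapshot `cno`."""
    cno = int(cno)  # the checkpoint number of the snapshot, as shown by lscp
    return ["mount", "-t", "nilfs2", "-r", "-o", f"cp={cno}",
            device, mount_point]
```

The resulting list can be passed to ``subprocess.run(argv, check=True)``
by a process with sufficient privileges; the sketch itself runs nothing.
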
To unmount the NILFS2 mount point or snapshot, simply::

 # umount /dir

Then, the cleaner daemon is automatically shut down by the umount
helper program (umount.nilfs2).

Disk format
===========

A nilfs2 volume is equally divided into a number of segments except
for the super block (SB) and segment #0.  A segment is the container
of logs.  Each log is composed of summary information blocks, payload
blocks, and an optional super root block (SR)::

   ______________________________________________________
  | |SB| | Segment | Segment | Segment | ... | Segment | |
  |_|__|_|____0____|____1____|____2____|_____|____N____|_|
  0 +1K +4K       +8M       +16M      +24M  +(8MB x N)
       .             .            (Typical offsets for 4KB-block)
    .                  .
  .______________________.
  | log | log |... | log |
  |__1__|__2__|____|__m__|
        .       .
      .               .
    .                       .
  .______________________________.
  | Summary | Payload blocks  |SR|
  |_blocks__|_________________|__|

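The typical offsets in the figure can be reproduced with a little
arithmetic.  The sketch below assumes the values shown for 4KB blocks
(8MB segments, segment #0 starting at +4K); the function name and its
default parameters are illustrative and are not read from a real
super block.

```python
KIB = 1024
MIB = 1024 * KIB


def segment_start(segnum, segment_size=8 * MIB, first_segment_offset=4 * KIB):
    """Byte offset at which segment `segnum` begins, per the figure."""
    if segnum < 0:
        raise ValueError("segment number must not be negative")
    if segnum == 0:
        # Segment #0 is shorter: it starts after the super block area.
        return first_segment_offset
    return segnum * segment_size
```

With these defaults, segment 1 starts at 8MB and segment 2 at 16MB,
matching the +8M and +16M marks in the figure.
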
The payload blocks are organized per file, and each file consists of
data blocks and B-tree node blocks::

    |<---       File-A        --->|<---       File-B        --->|
   _______________________________________________________________
    | Data blocks | B-tree blocks | Data blocks | B-tree blocks | ...
   _|_____________|_______________|_____________|_______________|_


Since only modified blocks are written to the log, a log may contain
files without data blocks or without B-tree node blocks.

The organization of the blocks is recorded in the summary information
blocks, which contain a header structure (nilfs_segment_summary),
per-file structures (nilfs_finfo), and per-block structures
(nilfs_binfo)::

  _________________________________________________________________________
 | Summary | finfo | binfo | ... | binfo | finfo | binfo | ... | binfo |...
 |_blocks__|___A___|_(A,1)_|_____|(A,Na)_|___B___|_(B,1)_|_____|(B,Nb)_|___


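The ordering in the figure (one finfo per file, followed by one binfo
per block of that file) can be modelled as follows.  This is a toy
in-memory model for illustration only: the ``FileEntry`` class and its
fields are hypothetical and do not mirror the on-disk layout of
nilfs_finfo or nilfs_binfo.

```python
from dataclasses import dataclass


@dataclass
class FileEntry:
    name: str      # label used in the figure, e.g. "A"
    nblocks: int   # number of blocks of this file written in the log


def summary_layout(files):
    """List the summary records in the order shown in the figure."""
    records = ["segment summary header"]
    for f in files:
        records.append(f"finfo {f.name}")
        # One binfo follows for every block of the file in this log.
        for i in range(1, f.nblocks + 1):
            records.append(f"binfo ({f.name},{i})")
    return records
```

For two files A and B with two blocks and one block respectively, the
result is the header, then ``finfo A``, ``binfo (A,1)``, ``binfo (A,2)``,
``finfo B``, and ``binfo (B,1)``.
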
The logs include regular files, directory files, symbolic link files
and several meta data files.  The meta data files are the files used
to maintain file system meta data.  The current version of NILFS2 uses
the following meta data files::

 1) Inode file (ifile)             -- Stores on-disk inodes
 2) Checkpoint file (cpfile)       -- Stores checkpoints
 3) Segment usage file (sufile)    -- Stores allocation state of segments
 4) Data address translation file  -- Maps virtual block numbers to usual
    (DAT)                             block numbers.  This file serves to
                                      make on-disk blocks relocatable.

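Why the DAT makes blocks relocatable can be seen in a toy model:
references hold a virtual block number, so moving a block only updates
one mapping entry.  The ``ToyDAT`` class below is a hypothetical
illustration built on a plain dictionary, not the real DAT format.

```python
class ToyDAT:
    """Toy mapping from virtual block numbers to usual block numbers."""

    def __init__(self):
        self._map = {}

    def assign(self, vblocknr, blocknr):
        self._map[vblocknr] = blocknr

    def translate(self, vblocknr):
        return self._map[vblocknr]

    def relocate(self, vblocknr, new_blocknr):
        # Only the DAT entry changes; holders of vblocknr are untouched.
        if vblocknr not in self._map:
            raise KeyError(vblocknr)
        self._map[vblocknr] = new_blocknr


dat = ToyDAT()
pointer_in_parent = 7          # a reference stores the virtual number
dat.assign(pointer_in_parent, 1000)
dat.relocate(pointer_in_parent, 5000)
```

After the relocation, ``dat.translate(pointer_in_parent)`` yields the
new location while ``pointer_in_parent`` itself is still 7.
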
The following figure shows a typical organization of the logs::

  _________________________________________________________________________
 | Summary | regular file | file  | ... | ifile | cpfile | sufile | DAT |SR|
 |_blocks__|_or_directory_|_______|_____|_______|________|________|_____|__|


To stride over segment boundaries, this sequence of files may be split
into multiple logs.  The sequence of logs that should be treated as
one logical log is delimited by flags marked in the segment summary.
The recovery code of nilfs2 looks at this boundary information to
ensure atomicity of updates.

A super root block is inserted for every checkpoint.  It includes
three special inodes: the inodes for the DAT, cpfile, and sufile.
Inodes of regular files, directories, symlinks, and other special files
are included in the ifile.  The inode of the ifile itself is included
in the corresponding checkpoint entry in the cpfile.  Thus, the
hierarchy among NILFS2 files can be depicted as follows::

  Super block (SB)
       |
       v
  Super root block (the latest cno=xx)
       |-- DAT
       |-- sufile
       `-- cpfile
              |-- ifile (cno=c1)
              |-- ifile (cno=c2) ---- file (ino=i1)
              :        :          |-- file (ino=i2)
              `-- ifile (cno=xx)  |-- file (ino=i3)
                                  :        :
                                  `-- file (ino=yy)
                                    ( regular file, directory, or symlink )

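The lookup path implied by this hierarchy can be sketched with nested
dictionaries.  Everything here is a toy model under stated assumptions:
the ``volume`` structure, its keys, and ``lookup_file`` are hypothetical
and stand in for the real on-disk structures.

```python
# Toy volume: the super block leads to the latest super root, whose cpfile
# has one entry per checkpoint, each holding that checkpoint's ifile.
volume = {
    "super_root": {
        "DAT": {},
        "sufile": {},
        "cpfile": {
            1: {"ifile": {11: "file (ino=11) as of cno=1"}},
            2: {"ifile": {11: "file (ino=11) as of cno=2",
                          12: "file (ino=12) as of cno=2"}},
        },
    },
}


def lookup_file(volume, cno, ino):
    """Follow super root -> cpfile entry -> ifile -> inode."""
    super_root = volume["super_root"]        # reached from the super block
    checkpoint = super_root["cpfile"][cno]   # entry for checkpoint cno
    return checkpoint["ifile"][ino]          # inode stored in that ifile
```

Because each checkpoint entry carries its own ifile, the same inode
number can resolve to different file versions under different
checkpoint numbers.
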
For details on the format of each file, please see nilfs2_ondisk.h
located in the include/uapi/linux directory.

There are no patents or other intellectual property that we protect
with regard to the design of NILFS2.  Replicating the design is
permitted in the hope that other operating systems can share (mount,
read, write, etc.) data stored in this format.