.. SPDX-License-Identifier: GPL-2.0

Bigalloc
--------

At the moment, the default size of a block is 4KiB, which is a commonly
supported page size on most MMU-capable hardware. This is fortunate, as
ext4 code is not prepared to handle the case where the block size
exceeds the page size. However, for a filesystem of mostly huge files,
it is desirable to be able to allocate disk blocks in units of multiple
blocks to reduce both fragmentation and metadata overhead. The
bigalloc feature provides exactly this ability.

The bigalloc feature (EXT4_FEATURE_RO_COMPAT_BIGALLOC) changes ext4 to
use clustered allocation, so that each bit in the ext4 block allocation
bitmap addresses a power-of-two number of blocks. For example, if the
file system is mainly going to be storing large files in the 4-32
megabyte range, it might make sense to set a cluster size of 1 megabyte.
This means that each bit in the block allocation bitmap now addresses
256 4KiB blocks. This shrinks the total size of the block allocation
bitmaps for a 2TiB file system from 64 megabytes to 256 kilobytes. It also
means that a block group addresses 32 gigabytes instead of 128 megabytes,
which further shrinks the amount of file system overhead for metadata.
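
The figures in the example above can be checked with a little arithmetic.
The following is a minimal sketch, assuming binary units (1 KiB = 1024
bytes) and that each block group is described by a single bitmap block, so
that a group covers ``block size * 8`` allocation units. The function names
are illustrative only and are not part of ext4.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB
TIB = 1024 * GIB


def bitmap_bytes(fs_bytes, unit_bytes):
    """Total allocation bitmap size: one bit per allocation unit."""
    units = fs_bytes // unit_bytes
    return units // 8


def group_bytes(block_bytes, unit_bytes):
    """Space covered by one block group (assumes one bitmap block per group)."""
    bits_per_bitmap = block_bytes * 8
    return bits_per_bitmap * unit_bytes


block = 4 * KIB      # block size from the example
cluster = 1 * MIB    # cluster size from the example

blocks_per_cluster = cluster // block

# Without bigalloc the allocation unit is a block; with it, a cluster.
bitmap_plain = bitmap_bytes(2 * TIB, block)
bitmap_bigalloc = bitmap_bytes(2 * TIB, cluster)
group_plain = group_bytes(block, block)
group_bigalloc = group_bytes(block, cluster)

print(blocks_per_cluster)
print(bitmap_plain // MIB, "MiB ->", bitmap_bigalloc // KIB, "KiB")
print(group_plain // MIB, "MiB ->", group_bigalloc // GIB, "GiB")
```

Both the bitmap size and the number of block groups shrink by the same
factor, the number of blocks per cluster.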

The administrator can set a block cluster size at mkfs time (which is
stored in the s_log_cluster_size field in the superblock); from then
on, the block bitmaps track clusters, not individual blocks. This means
that block groups can be several gigabytes in size (instead of just
128MiB); however, the minimum allocation unit becomes a cluster, not a
block, even for directories. TaoBao had a patchset to extend the “use
units of clusters instead of blocks” approach to the extent tree, though
it is not clear where those patches went; they eventually morphed into
“extent tree v2”, but that code has not landed as of May 2015.
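
As a rough illustration of how the stored field relates to the cluster
size, the sketch below assumes that s_log_cluster_size uses the same
encoding as s_log_block_size, that is, log2 of the size in units of 1KiB.
The helper name ``cluster_geometry`` is hypothetical and the code is not
taken from the kernel; it only shows the arithmetic.

```python
def cluster_geometry(s_log_block_size, s_log_cluster_size):
    """Return (block_bytes, cluster_bytes, blocks_per_cluster).

    Assumption: both fields encode log2(size / 1024), so the size in
    bytes is 1024 << field.
    """
    block_bytes = 1024 << s_log_block_size
    cluster_bytes = 1024 << s_log_cluster_size
    if cluster_bytes < block_bytes:
        # A cluster is a group of whole blocks, so it cannot be smaller.
        raise ValueError("cluster size must be at least the block size")
    return block_bytes, cluster_bytes, cluster_bytes // block_bytes


# 4KiB blocks (field value 2) with 1MiB clusters (field value 10)
geometry = cluster_geometry(2, 10)
print(geometry)
```

Under that assumption, the 1 megabyte cluster example corresponds to a
field value of 10 on a file system with 4KiB blocks.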