What:		/sys/block/<disk>/alignment_offset
Date:		April 2009
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
Description:
		Storage devices may report a physical block size that is
		bigger than the logical block size (for instance a drive
		with 4KB physical sectors exposing 512-byte logical
		blocks to the operating system). This parameter
		indicates how many bytes the beginning of the device is
		offset from the disk's natural alignment.


What:		/sys/block/<disk>/discard_alignment
Date:		May 2011
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
Description:
		Devices that support discard functionality may
		internally allocate space in units that are bigger than
		the exported logical block size. The discard_alignment
		parameter indicates how many bytes the beginning of the
		device is offset from the internal allocation unit's
		natural alignment.


What:		/sys/block/<disk>/diskseq
Date:		February 2021
Contact:	Matteo Croce <mcroce@microsoft.com>
Description:
		The /sys/block/<disk>/diskseq file reports the disk
		sequence number, which is a monotonically increasing
		number assigned to every drive.
		Some devices, like the loop device, refresh this number
		every time the backing file is changed.
		The value type is a 64-bit unsigned integer.


What:		/sys/block/<disk>/inflight
Date:		October 2009
Contact:	Jens Axboe <axboe@kernel.dk>, Nikanth Karthikesan <knikanth@suse.de>
Description:
		Reports the number of I/O requests currently in progress
		(pending / in flight) in a device driver. This can be less
		than the number of requests queued in the block device queue.
		The report contains 2 fields: one for read requests
		and one for write requests.
		The value type is unsigned int.
		Cf. Documentation/block/stat.rst which contains a single
		value for requests in flight.
		This is related to /sys/block/<disk>/queue/nr_requests
		and, for SCSI devices, also to their queue_depth.
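The two-field inflight format described above can be parsed with a short Python sketch (the sample contents below are hypothetical; real values depend on device load):

```python
def parse_inflight(text):
    """Parse the contents of /sys/block/<disk>/inflight: two unsigned
    integers, in-flight reads followed by in-flight writes."""
    reads, writes = text.split()
    return {"read": int(reads), "write": int(writes)}

# Hypothetical sample contents of the file.
sample = "3 1\n"
print(parse_inflight(sample))  # {'read': 3, 'write': 1}
```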
What:		/sys/block/<disk>/integrity/device_is_integrity_capable
Date:		July 2014
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
Description:
		Indicates whether a storage device is capable of storing
		integrity metadata. Set if the device is T10 PI-capable.


What:		/sys/block/<disk>/integrity/format
Date:		June 2008
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
Description:
		Metadata format for an integrity capable block device.
		E.g. T10-DIF-TYPE1-CRC.


What:		/sys/block/<disk>/integrity/protection_interval_bytes
Date:		July 2015
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
Description:
		Describes the number of data bytes which are protected
		by one integrity tuple. Typically the device's logical
		block size.


What:		/sys/block/<disk>/integrity/read_verify
Date:		June 2008
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
Description:
		Indicates whether the block layer should verify the
		integrity of read requests serviced by devices that
		support sending integrity metadata.


What:		/sys/block/<disk>/integrity/tag_size
Date:		June 2008
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
Description:
		Number of bytes of integrity tag space available per
		512 bytes of data.


What:		/sys/block/<disk>/integrity/write_generate
Date:		June 2008
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
Description:
		Indicates whether the block layer should automatically
		generate checksums for write requests bound for
		devices that support receiving integrity metadata.


What:		/sys/block/<disk>/<partition>/alignment_offset
Date:		April 2009
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
Description:
		Storage devices may report a physical block size that is
		bigger than the logical block size (for instance a drive
		with 4KB physical sectors exposing 512-byte logical
		blocks to the operating system). This parameter
		indicates how many bytes the beginning of the partition
		is offset from the disk's natural alignment.


What:		/sys/block/<disk>/<partition>/discard_alignment
Date:		May 2011
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
Description:
		Devices that support discard functionality may
		internally allocate space in units that are bigger than
		the exported logical block size. The discard_alignment
		parameter indicates how many bytes the beginning of the
		partition is offset from the internal allocation unit's
		natural alignment.


What:		/sys/block/<disk>/<partition>/stat
Date:		February 2008
Contact:	Jerome Marchand <jmarchan@redhat.com>
Description:
		The /sys/block/<disk>/<partition>/stat file displays the
		I/O statistics of partition <partition>. The format is the
		same as the format of /sys/block/<disk>/stat.


What:		/sys/block/<disk>/queue/add_random
Date:		June 2010
Contact:	linux-block@vger.kernel.org
Description:
		[RW] This file allows one to turn off the disk entropy
		contribution. The default value of this file is '1' (on).


What:		/sys/block/<disk>/queue/chunk_sectors
Date:		September 2016
Contact:	Hannes Reinecke <hare@suse.com>
Description:
		[RO] chunk_sectors has a different meaning depending on the
		type of the disk. For a RAID device (dm-raid), chunk_sectors
		indicates the size in 512B sectors of the RAID volume stripe
		segment. For a zoned block device, either host-aware or
		host-managed, chunk_sectors indicates the size in 512B sectors
		of the zones of the device, with the possible exception of the
		last zone of the device, which may be smaller.
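The alignment_offset attributes described above can be modeled with a few lines of Python (a simplified sketch of how the block layer derives the value for a partition; the LBA values below are illustrative):

```python
def alignment_offset(start_sector, logical_bs=512, physical_bs=4096):
    """Bytes from the start of a partition to the next natural
    (physical-block) boundary, given the partition's starting logical
    sector. A result of 0 means the partition is naturally aligned."""
    misalignment = (start_sector * logical_bs) % physical_bs
    return (physical_bs - misalignment) % physical_bs

# A legacy partition starting at LBA 63 on a 512e/4Kn drive:
print(alignment_offset(63))    # 512
# A partition starting at LBA 2048 is naturally aligned:
print(alignment_offset(2048))  # 0
```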
What:		/sys/block/<disk>/queue/dax
Date:		June 2016
Contact:	linux-block@vger.kernel.org
Description:
		[RO] This file indicates whether the device supports Direct
		Access (DAX), used by CPU-addressable storage to bypass the
		pagecache. It shows '1' if true, '0' if not.


What:		/sys/block/<disk>/queue/discard_granularity
Date:		May 2011
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
Description:
		[RO] Devices that support discard functionality may internally
		allocate space using units that are bigger than the logical
		block size. The discard_granularity parameter indicates the
		size of the internal allocation unit in bytes if reported by
		the device. Otherwise the discard_granularity will be set to
		match the device's physical block size. A discard_granularity
		of 0 means that the device does not support discard
		functionality.


What:		/sys/block/<disk>/queue/discard_max_bytes
Date:		May 2011
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
Description:
		[RW] While discard_max_hw_bytes is the hardware limit for the
		device, this setting is the software limit. Some devices
		exhibit large latencies when large discards are issued;
		setting this value lower will make Linux issue smaller
		discards and can help reduce latencies induced by large
		discard operations.


What:		/sys/block/<disk>/queue/discard_max_hw_bytes
Date:		July 2015
Contact:	linux-block@vger.kernel.org
Description:
		[RO] Devices that support discard functionality may have
		internal limits on the number of bytes that can be trimmed or
		unmapped in a single operation. The `discard_max_hw_bytes`
		parameter is set by the device driver to the maximum number of
		bytes that can be discarded in a single operation. Discard
		requests issued to the device must not exceed this limit.
		A `discard_max_hw_bytes` value of 0 means that the device does
		not support discard functionality.


What:		/sys/block/<disk>/queue/discard_zeroes_data
Date:		May 2011
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
Description:
		[RO] Will always return 0. Don't rely on any specific behavior
		for discards, and don't read this file.


What:		/sys/block/<disk>/queue/fua
Date:		May 2018
Contact:	linux-block@vger.kernel.org
Description:
		[RO] Whether or not the block driver supports the FUA flag for
		write requests. FUA stands for Force Unit Access. If the FUA
		flag is set, write requests must bypass the volatile cache of
		the storage device.


What:		/sys/block/<disk>/queue/hw_sector_size
Date:		January 2008
Contact:	linux-block@vger.kernel.org
Description:
		[RO] This is the hardware sector size of the device, in bytes.


What:		/sys/block/<disk>/queue/independent_access_ranges/
Date:		October 2021
Contact:	linux-block@vger.kernel.org
Description:
		[RO] The presence of this sub-directory of the
		/sys/block/xxx/queue/ directory indicates that the device is
		capable of executing requests targeting different sector ranges
		in parallel. For instance, single LUN multi-actuator hard-disks
		will have an independent_access_ranges directory if the device
		correctly advertises the sector ranges of its actuators.

		The independent_access_ranges directory contains one directory
		per access range, with each range described using the sector
		(RO) attribute file to indicate the first sector of the range
		and the nr_sectors (RO) attribute file to indicate the total
		number of sectors in the range starting from the first sector
		of the range.
		For example, a dual-actuator hard-disk will have the
		following independent_access_ranges entries::

			$ tree /sys/block/<disk>/queue/independent_access_ranges/
			/sys/block/<disk>/queue/independent_access_ranges/
			|-- 0
			|   |-- nr_sectors
			|   `-- sector
			`-- 1
			    |-- nr_sectors
			    `-- sector

		The sector and nr_sectors attributes use the 512B sector unit,
		regardless of the actual block size of the device. Independent
		access ranges do not overlap and include all sectors within the
		device capacity. The access ranges are numbered in increasing
		order of the range start sector, that is, the sector attribute
		of range 0 always has the value 0.


What:		/sys/block/<disk>/queue/io_poll
Date:		November 2015
Contact:	linux-block@vger.kernel.org
Description:
		[RW] When read, this file shows whether polling is enabled (1)
		or disabled (0). Writing '0' to this file will disable polling
		for this device. Writing any non-zero value will enable this
		feature.


What:		/sys/block/<disk>/queue/io_poll_delay
Date:		November 2016
Contact:	linux-block@vger.kernel.org
Description:
		[RW] If polling is enabled, this controls what kind of polling
		will be performed. It defaults to -1, which is classic polling.
		In this mode, the CPU will repeatedly ask for completions
		without giving up any time. If set to 0, a hybrid polling mode
		is used, where the kernel will attempt to make an educated
		guess at when the IO will complete. Based on this guess, the
		kernel will put the process issuing IO to sleep for an amount
		of time, before entering a classic poll loop. This mode might
		be a little slower than pure classic polling, but it will be
		more efficient. If set to a value larger than 0, the kernel
		will put the process issuing IO to sleep for this number of
		microseconds before entering classic polling.
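The invariants documented for independent_access_ranges (non-overlapping, sorted by start sector, covering the full capacity) can be sketched as a Python check; the (sector, nr_sectors) pairs below are hypothetical values as they would be read from the sysfs attributes:

```python
def check_access_ranges(ranges, capacity_sectors):
    """Verify that access ranges, given as (sector, nr_sectors) pairs
    in range-number order, are contiguous, non-overlapping, start at
    sector 0 and together cover the whole device capacity."""
    expected_start = 0
    for sector, nr_sectors in ranges:
        if sector != expected_start:
            return False  # gap, overlap, or out-of-order range
        expected_start = sector + nr_sectors
    return expected_start == capacity_sectors

# Hypothetical dual-actuator layout: two equal halves of the disk.
ranges = [(0, 1000000), (1000000, 1000000)]
print(check_access_ranges(ranges, 2000000))  # True
```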
What:		/sys/block/<disk>/queue/io_timeout
Date:		November 2018
Contact:	Weiping Zhang <zhangweiping@didiglobal.com>
Description:
		[RW] io_timeout is the request timeout in milliseconds. If a
		request does not complete in this time then the block driver
		timeout handler is invoked. That timeout handler can decide to
		retry the request, to fail it or to start a device recovery
		strategy.


What:		/sys/block/<disk>/queue/iostats
Date:		January 2009
Contact:	linux-block@vger.kernel.org
Description:
		[RW] This file is used to control (on/off) the iostats
		accounting of the disk.


What:		/sys/block/<disk>/queue/logical_block_size
Date:		May 2009
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
Description:
		[RO] This is the smallest unit the storage device can address.
		It is typically 512 bytes.


What:		/sys/block/<disk>/queue/max_active_zones
Date:		July 2020
Contact:	Niklas Cassel <niklas.cassel@wdc.com>
Description:
		[RO] For zoned block devices (zoned attribute indicating
		"host-managed" or "host-aware"), the sum of zones belonging to
		any of the zone states: EXPLICIT OPEN, IMPLICIT OPEN or CLOSED,
		is limited by this value. If this value is 0, there is no
		limit.

		If the host attempts to exceed this limit, the driver should
		report this error with BLK_STS_ZONE_ACTIVE_RESOURCE, which user
		space may see as the EOVERFLOW errno.


What:		/sys/block/<disk>/queue/max_discard_segments
Date:		February 2017
Contact:	linux-block@vger.kernel.org
Description:
		[RO] The maximum number of DMA scatter/gather entries in a
		discard request.


What:		/sys/block/<disk>/queue/max_hw_sectors_kb
Date:		September 2004
Contact:	linux-block@vger.kernel.org
Description:
		[RO] This is the maximum number of kilobytes supported in a
		single data transfer.
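The relationship between the software limit max_sectors_kb (described below) and the hardware limit max_hw_sectors_kb can be sketched in Python; this is a simplified model of the constraint, not the kernel's actual implementation, and the limit values used are hypothetical:

```python
def validate_max_sectors_kb(requested_kb, max_hw_sectors_kb):
    """Simplified model of the check applied when max_sectors_kb is
    written: the value must be positive and must not exceed the
    read-only hardware limit; otherwise the write fails."""
    if requested_kb < 1 or requested_kb > max_hw_sectors_kb:
        raise ValueError("value outside [1, max_hw_sectors_kb]")
    return requested_kb

# Hypothetical limits: lowering the transfer size to 512 KB on a
# device whose hardware supports up to 32767 KB per transfer.
print(validate_max_sectors_kb(512, 32767))  # 512
```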
What:		/sys/block/<disk>/queue/max_integrity_segments
Date:		September 2010
Contact:	linux-block@vger.kernel.org
Description:
		[RO] Maximum number of elements in a DMA scatter/gather list
		with integrity data that will be submitted by the block layer
		core to the associated block driver.


What:		/sys/block/<disk>/queue/max_open_zones
Date:		July 2020
Contact:	Niklas Cassel <niklas.cassel@wdc.com>
Description:
		[RO] For zoned block devices (zoned attribute indicating
		"host-managed" or "host-aware"), the sum of zones belonging to
		any of the zone states: EXPLICIT OPEN or IMPLICIT OPEN, is
		limited by this value. If this value is 0, there is no limit.


What:		/sys/block/<disk>/queue/max_sectors_kb
Date:		September 2004
Contact:	linux-block@vger.kernel.org
Description:
		[RW] This is the maximum number of kilobytes that the block
		layer will allow for a filesystem request. Must be smaller than
		or equal to the maximum size allowed by the hardware.


What:		/sys/block/<disk>/queue/max_segment_size
Date:		March 2010
Contact:	linux-block@vger.kernel.org
Description:
		[RO] Maximum size in bytes of a single element in a DMA
		scatter/gather list.


What:		/sys/block/<disk>/queue/max_segments
Date:		March 2010
Contact:	linux-block@vger.kernel.org
Description:
		[RO] Maximum number of elements in a DMA scatter/gather list
		that is submitted to the associated block driver.


What:		/sys/block/<disk>/queue/minimum_io_size
Date:		April 2009
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
Description:
		[RO] Storage devices may report a granularity or preferred
		minimum I/O size which is the smallest request the device can
		perform without incurring a performance penalty. For disk
		drives this is often the physical block size. For RAID arrays
		it is often the stripe chunk size.
		A properly aligned multiple of minimum_io_size is the
		preferred request size for workloads where a high number of
		I/O operations is desired.


What:		/sys/block/<disk>/queue/nomerges
Date:		January 2010
Contact:	linux-block@vger.kernel.org
Description:
		[RW] Standard I/O elevator operations include attempts to
		merge contiguous I/Os. For known random I/O loads these
		attempts will always fail and result in extra cycles being
		spent in the kernel. This allows one to turn off this behavior
		in one of two ways: When set to 1, complex merge checks are
		disabled, but the simple one-shot merges with the previous I/O
		request are enabled. When set to 2, all merge tries are
		disabled. The default value is 0 - which enables all types of
		merge tries.


What:		/sys/block/<disk>/queue/nr_requests
Date:		July 2003
Contact:	linux-block@vger.kernel.org
Description:
		[RW] This controls how many requests may be allocated in the
		block layer for read or write requests. Note that the total
		allocated number may be twice this amount, since it applies
		only to reads or writes (not the accumulated sum).

		To avoid priority inversion through request starvation, a
		request queue maintains a separate request pool per each
		cgroup when CONFIG_BLK_CGROUP is enabled, and this parameter
		applies to each such per-block-cgroup request pool. IOW, if
		there are N block cgroups, each request queue may have up to N
		request pools, each independently regulated by nr_requests.


What:		/sys/block/<disk>/queue/nr_zones
Date:		November 2018
Contact:	Damien Le Moal <damien.lemoal@wdc.com>
Description:
		[RO] nr_zones indicates the total number of zones of a zoned
		block device ("host-aware" or "host-managed" zone model). For
		regular block devices, the value is always 0.


What:		/sys/block/<disk>/queue/optimal_io_size
Date:		April 2009
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
Description:
		[RO] Storage devices may report an optimal I/O size, which is
		the device's preferred unit for sustained I/O. This is rarely
		reported for disk drives. For RAID arrays it is usually the
		stripe width or the internal track size. A properly aligned
		multiple of optimal_io_size is the preferred request size for
		workloads where sustained throughput is desired. If no optimal
		I/O size is reported this file contains 0.


What:		/sys/block/<disk>/queue/physical_block_size
Date:		May 2009
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
Description:
		[RO] This is the smallest unit a physical storage device can
		write atomically. It is usually the same as the logical block
		size but may be bigger. One example is SATA drives with 4KB
		sectors that expose a 512-byte logical block size to the
		operating system. For stacked block devices the
		physical_block_size variable contains the maximum
		physical_block_size of the component devices.


What:		/sys/block/<disk>/queue/read_ahead_kb
Date:		May 2004
Contact:	linux-block@vger.kernel.org
Description:
		[RW] Maximum number of kilobytes to read-ahead for filesystems
		on this block device.


What:		/sys/block/<disk>/queue/rotational
Date:		January 2009
Contact:	linux-block@vger.kernel.org
Description:
		[RW] This file is used to indicate whether the device is of
		rotational or non-rotational type.


What:		/sys/block/<disk>/queue/rq_affinity
Date:		September 2008
Contact:	linux-block@vger.kernel.org
Description:
		[RW] If this option is '1', the block layer will migrate
		request completions to the cpu "group" that originally
		submitted the request. For some workloads this provides a
		significant reduction in CPU cycles due to caching effects.
		For storage configurations that need to maximize distribution
		of completion processing, setting this option to '2' forces
		the completion to run on the requesting cpu (bypassing the
		"group" aggregation logic).


What:		/sys/block/<disk>/queue/scheduler
Date:		October 2004
Contact:	linux-block@vger.kernel.org
Description:
		[RW] When read, this file will display the current and
		available IO schedulers for this block device. The currently
		active IO scheduler will be enclosed in [] brackets. Writing
		an IO scheduler name to this file will switch control of this
		block device to that new IO scheduler. Note that writing an IO
		scheduler name to this file will attempt to load that IO
		scheduler module, if it isn't already present in the system.


What:		/sys/block/<disk>/queue/stable_writes
Date:		September 2020
Contact:	linux-block@vger.kernel.org
Description:
		[RW] This file will contain '1' if memory must not be modified
		while it is being used in a write request to this device. When
		this is the case and the kernel is performing writeback of a
		page, the kernel will wait for writeback to complete before
		allowing the page to be modified again, rather than allowing
		immediate modification as is normally the case. This
		restriction arises when the device accesses the memory
		multiple times where the same data must be seen every time --
		for example, once to calculate a checksum and once to actually
		write the data. If no such restriction exists, this file will
		contain '0'. This file is writable for testing purposes.


What:		/sys/block/<disk>/queue/throttle_sample_time
Date:		March 2017
Contact:	linux-block@vger.kernel.org
Description:
		[RW] This is the time window that blk-throttle samples data,
		in milliseconds. blk-throttle makes decisions based on these
		samples.
		A lower sample time means cgroups have smoother throughput,
		but higher CPU overhead. This exists only when
		CONFIG_BLK_DEV_THROTTLING_LOW is enabled.


What:		/sys/block/<disk>/queue/virt_boundary_mask
Date:		April 2021
Contact:	linux-block@vger.kernel.org
Description:
		[RO] This file shows the I/O segment memory alignment mask for
		the block device. I/O requests to this device will be split
		between segments wherever either the memory address of the end
		of the previous segment or the memory address of the beginning
		of the current segment is not aligned to virt_boundary_mask + 1
		bytes.


What:		/sys/block/<disk>/queue/wbt_lat_usec
Date:		November 2016
Contact:	linux-block@vger.kernel.org
Description:
		[RW] If the device is registered for writeback throttling,
		then this file shows the target minimum read latency. If this
		latency is exceeded in a given window of time (see
		wb_window_usec), then the writeback throttling will start
		scaling back writes. Writing a value of '0' to this file
		disables the feature. Writing a value of '-1' to this file
		resets the value to the default setting.


What:		/sys/block/<disk>/queue/write_cache
Date:		April 2016
Contact:	linux-block@vger.kernel.org
Description:
		[RW] When read, this file will display whether the device has
		write back caching enabled or not. It will return "write back"
		for the former case, and "write through" for the latter.
		Writing to this file can change the kernel's view of the
		device, but it doesn't alter the device state. This means that
		it might not be safe to toggle the setting from "write back"
		to "write through", since that will also eliminate cache
		flushes issued by the kernel.


What:		/sys/block/<disk>/queue/write_same_max_bytes
Date:		January 2012
Contact:	Martin K. Petersen <martin.petersen@oracle.com>
Description:
		[RO] Some devices support a write same operation in which a
		single data block can be written to a range of several
		contiguous blocks on storage. This can be used to wipe areas
		on disk or to initialize drives in a RAID configuration.
		write_same_max_bytes indicates how many bytes can be written
		in a single write same command. If write_same_max_bytes is 0,
		write same is not supported by the device.


What:		/sys/block/<disk>/queue/write_zeroes_max_bytes
Date:		November 2016
Contact:	Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Description:
		[RO] Some devices support a write zeroes operation in which a
		single request can be issued to zero out a range of contiguous
		blocks on storage without carrying any payload in the request.
		This can be used to optimize writing zeroes to the device.
		write_zeroes_max_bytes indicates how many bytes can be written
		in a single write zeroes command. If write_zeroes_max_bytes is
		0, write zeroes is not supported by the device.


What:		/sys/block/<disk>/queue/zone_append_max_bytes
Date:		May 2020
Contact:	linux-block@vger.kernel.org
Description:
		[RO] This is the maximum number of bytes that can be written
		to a sequential zone of a zoned block device using a zone
		append write operation (REQ_OP_ZONE_APPEND). This value is
		always 0 for regular block devices.


What:		/sys/block/<disk>/queue/zone_write_granularity
Date:		January 2021
Contact:	linux-block@vger.kernel.org
Description:
		[RO] This indicates the alignment constraint, in bytes, for
		write operations in sequential zones of zoned block devices
		(devices with a zoned attribute that reports "host-managed" or
		"host-aware"). This value is always 0 for regular block
		devices.
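The zone_write_granularity constraint described above can be sketched as a simple Python check; this only models the documented alignment rule (real zoned writes must also respect the zone write pointer, which is outside the scope of this attribute):

```python
def is_aligned_zone_write(offset_bytes, length_bytes, zone_write_granularity):
    """Check whether a write to a sequential zone satisfies the
    zone_write_granularity alignment constraint: both the start offset
    and the length must be multiples of the granularity. A granularity
    of 0 means a regular block device with no such constraint."""
    if zone_write_granularity == 0:
        return True
    return (offset_bytes % zone_write_granularity == 0
            and length_bytes % zone_write_granularity == 0)

# Hypothetical 4KB granularity: an 8KB-offset, 4KB write is aligned.
print(is_aligned_zone_write(8192, 4096, 4096))  # True
```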
What:		/sys/block/<disk>/queue/zoned
Date:		September 2016
Contact:	Damien Le Moal <damien.lemoal@wdc.com>
Description:
		[RO] zoned indicates if the device is a zoned block device and
		the zone model of the device if it is indeed zoned. The
		possible values indicated by zoned are "none" for regular
		block devices and "host-aware" or "host-managed" for zoned
		block devices. The characteristics of host-aware and
		host-managed zoned block devices are described in the ZBC
		(Zoned Block Commands) and ZAC (Zoned Device ATA Command Set)
		standards. These standards also define the "drive-managed"
		zone model. However, since drive-managed zoned block devices
		do not support zone commands, they will be treated as regular
		block devices and zoned will report "none".


What:		/sys/block/<disk>/stat
Date:		February 2008
Contact:	Jerome Marchand <jmarchan@redhat.com>
Description:
		The /sys/block/<disk>/stat file displays the I/O
		statistics of disk <disk>. It contains 17 fields:

		==  ==============================================
		 1  reads completed successfully
		 2  reads merged
		 3  sectors read
		 4  time spent reading (ms)
		 5  writes completed
		 6  writes merged
		 7  sectors written
		 8  time spent writing (ms)
		 9  I/Os currently in progress
		10  time spent doing I/Os (ms)
		11  weighted time spent doing I/Os (ms)
		12  discards completed
		13  discards merged
		14  sectors discarded
		15  time spent discarding (ms)
		16  flush requests completed
		17  time spent flushing (ms)
		==  ==============================================

		For more details refer to Documentation/admin-guide/iostats.rst
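The 17 stat fields can be parsed into named counters with a short Python sketch (the field names follow Documentation/block/stat.rst; the sample line below is hypothetical):

```python
STAT_FIELDS = (
    "read_ios", "read_merges", "read_sectors", "read_ticks",
    "write_ios", "write_merges", "write_sectors", "write_ticks",
    "in_flight", "io_ticks", "time_in_queue",
    "discard_ios", "discard_merges", "discard_sectors", "discard_ticks",
    "flush_ios", "flush_ticks",
)

def parse_stat(text):
    """Parse the 17 whitespace-separated counters of
    /sys/block/<disk>/stat into a field-name -> value dict."""
    values = [int(v) for v in text.split()]
    return dict(zip(STAT_FIELDS, values))

# Hypothetical sample contents of the file.
sample = "4379 1 182582 1166 553 624 15874 611 0 1580 1777 0 0 0 0 12 34"
stats = parse_stat(sample)
print(stats["read_ios"], stats["flush_ticks"])  # 4379 34
```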