==========================================
Explicit volatile write back cache control
==========================================

Introduction
------------

Many storage devices, especially in the consumer market, come with volatile
write back caches. That means the devices signal I/O completion to the
operating system before the data has actually reached the non-volatile
storage. This behavior obviously speeds up various workloads, but it means the
operating system needs to force data out to the non-volatile storage when it
performs a data integrity operation like fsync, sync or an unmount.

The Linux block layer provides two simple mechanisms that let filesystems
control the caching behavior of the storage device. These mechanisms are
a forced cache flush, and the Forced Unit Access (FUA) flag for requests.


Explicit cache flushes
----------------------

The REQ_PREFLUSH flag can be ORed into the r/w flags of a bio submitted from
the filesystem and will make sure the volatile cache of the storage device
has been flushed before the actual I/O operation is started.
This explicitly
guarantees that previously completed write requests are on non-volatile
storage before the flagged bio starts. In addition the REQ_PREFLUSH flag can be
set on an otherwise empty bio structure, which causes only an explicit cache
flush without any dependent I/O. It is recommended to use
the blkdev_issue_flush() helper for a pure cache flush.


Forced Unit Access
------------------

The REQ_FUA flag can be ORed into the r/w flags of a bio submitted from the
filesystem and will make sure that I/O completion for this request is only
signaled after the data has been committed to non-volatile storage.


Implementation details for filesystems
--------------------------------------

Filesystems can simply set the REQ_PREFLUSH and REQ_FUA bits and do not have to
worry about whether the underlying devices need any explicit cache flushing or
how the Forced Unit Access is implemented. The REQ_PREFLUSH and REQ_FUA flags
may both be set on a single bio.
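
The combined effect of the two flags can be illustrated with a small userspace
model. This is only a sketch, not kernel code: ``struct model_dev``, the
``MODEL_*`` flags and all ``model_*()`` functions are hypothetical names
invented for the illustration, and the model assumes a device that honours
both flags exactly as described above. It shows a journal-style commit in
which two ordinary writes are followed by a commit record carrying both flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags for this model; they only mirror the roles of
 * REQ_PREFLUSH and REQ_FUA and are not the kernel definitions. */
enum { MODEL_PREFLUSH = 1u << 0, MODEL_FUA = 1u << 1 };

#define MODEL_NBLOCKS 8

/* Toy device: "cache" is volatile, "media" survives a power loss. */
struct model_dev {
    int cache[MODEL_NBLOCKS];
    bool dirty[MODEL_NBLOCKS];
    int media[MODEL_NBLOCKS];
};

/* Write every dirty cached block to the media. */
static void model_flush(struct model_dev *d)
{
    for (size_t i = 0; i < MODEL_NBLOCKS; i++) {
        if (d->dirty[i]) {
            d->media[i] = d->cache[i];
            d->dirty[i] = false;
        }
    }
}

/* Complete one write of val to block blk, honouring the model flags. */
static void model_write(struct model_dev *d, size_t blk, int val,
                        unsigned flags)
{
    assert(blk < MODEL_NBLOCKS);
    if (flags & MODEL_PREFLUSH)
        model_flush(d);          /* earlier completed writes become durable */
    d->cache[blk] = val;
    d->dirty[blk] = true;
    if (flags & MODEL_FUA) {     /* this write is durable before completion */
        d->media[blk] = val;
        d->dirty[blk] = false;
    }
}

/* Power loss: anything that only lived in the cache is gone. */
static void model_power_loss(struct model_dev *d)
{
    for (size_t i = 0; i < MODEL_NBLOCKS; i++) {
        d->cache[i] = d->media[i];
        d->dirty[i] = false;
    }
}

/* Journal-style commit: two plain writes, then a commit record that
 * carries both flags. */
static void model_commit(struct model_dev *d)
{
    model_write(d, 0, 11, 0);
    model_write(d, 1, 22, 0);
    model_write(d, 2, 99, MODEL_PREFLUSH | MODEL_FUA);
}
```

In this model, calling ``model_commit()`` on a zero-initialised device and then
``model_power_loss()`` leaves blocks 0, 1 and 2 intact on the media, while a
later unflagged write to another block is lost.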


Implementation details for bio based block drivers
--------------------------------------------------

These drivers will always see the REQ_PREFLUSH and REQ_FUA bits as they sit
directly below the submit_bio interface. For remapping drivers the REQ_FUA
bits need to be propagated to underlying devices, and a global flush needs
to be implemented for bios with the REQ_PREFLUSH bit set. For real device
drivers that do not have a volatile cache the REQ_PREFLUSH and REQ_FUA bits
on non-empty bios can simply be ignored, and REQ_PREFLUSH requests without
data can be completed successfully without doing any work. Drivers for
devices with volatile caches need to implement the support for these
flags themselves without any help from the block layer.


Implementation details for request_fn based block drivers
---------------------------------------------------------

For devices that do not support volatile write caches there is no driver
support required; the block layer completes empty REQ_PREFLUSH requests before
entering the driver and strips off the REQ_PREFLUSH and REQ_FUA bits from
requests that have a payload.
For devices with volatile write caches the
driver needs to tell the block layer that it supports flushing caches by
doing::

    blk_queue_write_cache(sdkp->disk->queue, true, false);

and handle empty REQ_OP_FLUSH requests in its prep_fn/request_fn. Note that
REQ_PREFLUSH requests with a payload are automatically turned into a sequence
of an empty REQ_OP_FLUSH request followed by the actual write by the block
layer. For devices that also support the FUA bit the block layer needs
to be told to pass through the REQ_FUA bit using::

    blk_queue_write_cache(sdkp->disk->queue, true, true);

and the driver must handle write requests that have the REQ_FUA bit set
in prep_fn/request_fn. If the FUA bit is not natively supported the block
layer turns it into an empty REQ_OP_FLUSH request after the actual write.
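
The cases described in this section can be summarised as a small decision
function. The following is a userspace sketch only, not the block layer
implementation: ``model_decompose()``, ``enum model_step`` and the ``MODEL_*``
flags are hypothetical names, and the two boolean parameters stand in for the
two capabilities a driver announces through blk_queue_write_cache(). It
returns the sequence of requests the driver would see for one flagged request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags for this model; not the kernel definitions. */
enum { MODEL_PREFLUSH = 1u << 0, MODEL_FUA = 1u << 1 };

/* What reaches the driver in this model. */
enum model_step { STEP_FLUSH, STEP_WRITE, STEP_WRITE_FUA };

/*
 * Fill steps[] (room for three entries) with the requests the driver
 * would see, in order, and return how many there are.
 */
static size_t model_decompose(unsigned flags, bool has_data,
                              bool write_cache, bool native_fua,
                              enum model_step steps[3])
{
    size_t n = 0;

    assert(steps != NULL);

    if (!write_cache) {
        /* No volatile cache: both flags are stripped, and an empty
         * flush completes without reaching the driver. */
        if (has_data)
            steps[n++] = STEP_WRITE;
        return n;
    }

    if (flags & MODEL_PREFLUSH)
        steps[n++] = STEP_FLUSH;          /* empty flush first */

    if (has_data) {
        bool fua = (flags & MODEL_FUA) != 0;

        if (fua && native_fua) {
            steps[n++] = STEP_WRITE_FUA;  /* FUA bit passed through */
        } else {
            steps[n++] = STEP_WRITE;
            if (fua)
                steps[n++] = STEP_FLUSH;  /* FUA emulated by a post-flush */
        }
    }
    return n;
}
```

For example, in this model a write with both flags set on a device that has a
write cache but no native FUA support yields three steps: a flush, the write,
and a second flush.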