==============================
Deadline IO scheduler tunables
==============================

This little file attempts to document how the deadline io scheduler works.
In particular, it will clarify the meaning of the exposed tunables that may be
of interest to power users.

Selecting IO schedulers
-----------------------
Refer to Documentation/block/switching-sched.rst for information on
selecting an io scheduler on a per-device basis.

------------------------------------------------------------------------------

read_expire	(in ms)
-----------------------

The goal of the deadline io scheduler is to attempt to guarantee a start
service time for a request. Because we focus mainly on read latencies, the
read deadline is tunable. When a read request first enters the io scheduler,
it is assigned a deadline equal to the current time plus the read_expire
value, in milliseconds.


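The deadline assignment described for read_expire can be sketched as a toy
model. This is not kernel code: the names ``enqueue_read`` and
``head_expired``, the dictionary layout, and the 500 ms value are assumptions
made for illustration. The sketch also assumes requests are kept in arrival
order, so that only the oldest request needs to be checked for expiry.

```python
from collections import deque

READ_EXPIRE_MS = 500  # illustrative value, not read from any device


def enqueue_read(fifo, sector, now_ms, read_expire_ms=READ_EXPIRE_MS):
    """Append a read to the FIFO with deadline = arrival time + read_expire."""
    request = {"sector": sector, "deadline_ms": now_ms + read_expire_ms}
    fifo.append(request)
    return request


def head_expired(fifo, now_ms):
    """True if the oldest queued request has reached its deadline."""
    return bool(fifo) and now_ms >= fifo[0]["deadline_ms"]


fifo = deque()
enqueue_read(fifo, sector=2048, now_ms=1000)
print(fifo[0]["deadline_ms"])            # 1500
print(head_expired(fifo, now_ms=1200))   # False
print(head_expired(fifo, now_ms=1500))   # True
```

Because every read receives the same read_expire offset, arrival order and
deadline order coincide in this model.

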
write_expire	(in ms)
-----------------------

Similar to read_expire mentioned above, but for writes.


fifo_batch	(number of requests)
------------------------------------

Requests are grouped into ``batches`` of a particular data direction (read or
write) which are serviced in increasing sector order.  To limit extra seeking,
deadline expiries are only checked between batches.  fifo_batch controls the
maximum number of requests per batch.

This parameter tunes the balance between per-request latency and aggregate
throughput.  When low latency is the primary concern, smaller is better (where
a value of 1 yields first-come first-served behaviour).  Increasing fifo_batch
generally improves throughput, at the cost of latency variation.


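The latency/throughput trade-off can be illustrated with a small simulation
of one data direction. This is a deliberately simplified sketch, not the
scheduler's implementation: ``dispatch_order`` is a hypothetical helper, all
requests are assumed to be queued before dispatch starts, and the oldest
pending request is assumed to have expired whenever a batch ends, so each new
batch starts at that request.

```python
def dispatch_order(arrivals, fifo_batch):
    """Return sectors in dispatch order for one data direction.

    arrivals: sectors in arrival order, all queued before dispatch starts.
    Simplifying assumption: the oldest pending request has always expired
    when a batch ends, so every new batch starts at that request.
    """
    pending = list(arrivals)          # kept in arrival (FIFO) order
    order = []
    while pending:
        start = pending[0]            # oldest request opens the batch
        # Requests at or beyond the starting sector, in increasing order.
        sweep = sorted(s for s in pending if s >= start)
        batch = sweep[:fifo_batch]    # at most fifo_batch requests per batch
        for sector in batch:
            pending.remove(sector)
        order.extend(batch)
    return order


arrivals = [10, 70, 30, 90, 50]
print(dispatch_order(arrivals, fifo_batch=1))   # [10, 70, 30, 90, 50]
print(dispatch_order(arrivals, fifo_batch=2))   # [10, 30, 70, 90, 50]
print(dispatch_order(arrivals, fifo_batch=16))  # [10, 30, 50, 70, 90]
```

With fifo_batch set to 1 the output equals the arrival order (first-come
first-served); with a large value the requests are served in a single
sector-ordered sweep, which reduces seeking but lets older requests wait
longer.

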
writes_starved	(number of dispatches)
--------------------------------------

When we have to move requests from the io scheduler queue to the block
device dispatch queue, we always give a preference to reads. However, we
don't want to starve writes indefinitely either. So writes_starved controls
how many times we give preference to reads over writes. Once reads have been
preferred writes_starved times, we dispatch some writes based on the same
criteria as reads.


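The read preference and its limit can be sketched as a counter. The model
below is an illustration only: ``choose_batches`` is a hypothetical name, and
it assumes that both reads and writes are pending at every decision point.

```python
def choose_batches(n_batches, writes_starved):
    """Direction chosen for each batch when reads and writes are both pending."""
    starved = 0   # times reads were preferred while writes were waiting
    choices = []
    for _ in range(n_batches):
        if starved >= writes_starved:
            # Writes have waited long enough: serve them and reset the count.
            choices.append("write")
            starved = 0
        else:
            choices.append("read")
            starved += 1
    return choices


print(choose_batches(6, writes_starved=2))
# ['read', 'read', 'write', 'read', 'read', 'write']
```

A larger writes_starved value lengthens the run of read batches between
write batches in this model.

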
front_merges	(bool)
----------------------

Sometimes it happens that a request enters the io scheduler that is contiguous
with a request that is already on the queue. Either it fits in the back of that
request, or it fits at the front. That is called either a back merge candidate
or a front merge candidate. Due to the way files are typically laid out,
back merges are much more common than front merges. For some workloads, you
may even know that it is a waste of time to attempt to front merge requests.
Setting front_merges to 0 disables this functionality.
Front merges may still occur due to the cached last_merge hint, but since
that comes at basically 0 cost we leave that on. We simply disable the
rbtree front sector lookup when the io scheduler merge function is called.


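The two kinds of merge candidate can be told apart by comparing sector
ranges. The sketch below is a hypothetical helper, not the scheduler's merge
function: requests are modelled as ``(start_sector, n_sectors)`` tuples, and
the cached last_merge hint is not modelled, so only the effect of disabling
the front lookup is shown.

```python
def merge_candidate(queued, new, front_merges=True):
    """Classify a new request against a queued one.

    Both requests are (start_sector, n_sectors) tuples.
    Returns "back", "front", or None if the requests are not contiguous
    (or if the only match is a front merge and front_merges is off).
    """
    q_start, q_len = queued
    n_start, n_len = new
    if q_start + q_len == n_start:
        return "back"     # new request begins where the queued one ends
    if front_merges and n_start + n_len == q_start:
        return "front"    # new request ends where the queued one begins
    return None


print(merge_candidate((100, 8), (108, 8)))                      # back
print(merge_candidate((100, 8), (92, 8)))                       # front
print(merge_candidate((100, 8), (92, 8), front_merges=False))   # None
```

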
Nov 11 2002, Jens Axboe <jens.axboe@oracle.com>