===============
dm-service-time
===============

dm-service-time is a path selector module for device-mapper targets,
which selects the path with the shortest estimated service time for
the incoming I/O.

The service time for each path is estimated by dividing the total size
of in-flight I/Os on the path by the performance value of the path.
The performance value is a relative throughput value among all paths
in a path-group, and it can be specified as a table argument.
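
As a rough illustration of the estimate described above, the following
Python sketch divides an in-flight size by a relative throughput value.
The function name ``estimated_service_time`` and the sample sizes are
hypothetical, and modelling a throughput of 0 as an infinite service time
is an assumption made only for this example.

```python
import math


def estimated_service_time(in_flight_size, relative_throughput):
    """Estimate service time as in-flight size divided by relative throughput."""
    if relative_throughput == 0:
        # Assumption for this sketch only: a zero-throughput path is
        # modelled as infinitely slow so that it always compares worst.
        return math.inf
    return in_flight_size / relative_throughput


# Two paths carrying the same amount of in-flight I/O (arbitrary units).
slow = estimated_service_time(8192, 1)
fast = estimated_service_time(8192, 4)
print(slow, fast)
```

With equal in-flight sizes, the path with the larger relative throughput
gets the smaller estimate.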

The path selector name is 'service-time'.

Table parameters for each path:

    [<repeat_count> [<relative_throughput>]]
	<repeat_count>:
			The number of I/Os to dispatch using the selected
			path before switching to the next path.
			If not given, the internal default is used.  To check
			the default value, see the activated table.
	<relative_throughput>:
			The relative throughput value of the path
			among all paths in the path-group.
			The valid range is 0-100.
			If not given, the default value '1' is used.
			If '0' is given, the path isn't selected while
			other paths having a positive value are available.
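
The defaults and the valid range listed above can be sketched as a small
validation helper.  This is a hypothetical userspace illustration, not the
kernel's parser: the function name ``parse_path_args`` is invented here, and
``None`` stands in for the internal default of <repeat_count>, whose value
is not stated in this document.

```python
def parse_path_args(args):
    """Interpret the optional per-path arguments [repeat_count [relative_throughput]].

    `args` is the list of strings that follow a path's device in the table.
    """
    if len(args) > 2:
        raise ValueError("at most two per-path arguments are expected")

    # None marks "use the internal default"; its value is not given here.
    repeat_count = int(args[0]) if len(args) >= 1 else None

    # If omitted, the relative throughput is '1'.
    relative_throughput = int(args[1]) if len(args) == 2 else 1
    if not 0 <= relative_throughput <= 100:
        raise ValueError("relative_throughput must be in the range 0-100")

    return repeat_count, relative_throughput


print(parse_path_args(["128", "4"]))
print(parse_path_args([]))
```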

Status for each path:

    <status> <fail-count> <in-flight-size> <relative_throughput>
	<status>:
		'A' if the path is active, 'F' if the path is failed.
	<fail-count>:
		The number of path failures.
	<in-flight-size>:
		The size of in-flight I/Os on the path.
	<relative_throughput>:
		The relative throughput value of the path
		among all paths in the path-group.
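
For scripts that consume this status, the four per-path fields can be read
as in the sketch below.  The names ``PathStatus`` and ``parse_path_status``
are hypothetical, and the helper assumes the caller has already isolated
the four fields that follow a path's device number.

```python
from typing import NamedTuple


class PathStatus(NamedTuple):
    active: bool              # True for 'A', False for 'F'
    fail_count: int           # number of path failures
    in_flight_size: int       # size of in-flight I/Os on the path
    relative_throughput: int  # relative throughput within the path-group


def parse_path_status(fields):
    """Convert ['A', '0', '0', '4'] style fields into a PathStatus."""
    if len(fields) != 4:
        raise ValueError("expected exactly four status fields")
    status, fail_count, in_flight_size, relative_throughput = fields
    if status not in ("A", "F"):
        raise ValueError("status must be 'A' or 'F'")
    return PathStatus(status == "A", int(fail_count),
                      int(in_flight_size), int(relative_throughput))


print(parse_path_status("A 0 0 4".split()))
```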


Algorithm
=========

dm-service-time adds the I/O size to 'in-flight-size' when the I/O is
dispatched and subtracts it when the I/O completes.
Basically, dm-service-time selects the path with the minimum service time,
which is calculated by::

	('in-flight-size' + 'size-of-incoming-io') / 'relative_throughput'

However, the optimizations below are used to avoid the calculation
where possible.

	1. If the paths have the same 'relative_throughput', skip
	   the division and just compare the 'in-flight-size'.

	2. If the paths have the same 'in-flight-size', skip the division
	   and just compare the 'relative_throughput'.

	3. If some paths have non-zero 'relative_throughput' and others
	   have zero 'relative_throughput', ignore those paths with zero
	   'relative_throughput'.

If none of these optimizations applies, calculate the service time of
each path and compare the results.
If the calculated service times are equal, the path with the higher
'relative_throughput' may be better, so compare 'relative_throughput'
in that case.
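
The comparison described in this section can be sketched in Python as
follows.  This is an illustration under stated assumptions, not the kernel
implementation: the ``Path`` class and both function names are hypothetical,
and comparing cross-multiplied values instead of dividing is a choice made
here to stay in integer arithmetic.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    in_flight_size: int
    relative_throughput: int


def compare_load(p1, p2, incoming):
    """Return a negative value if p1 is preferable, positive if p2 is."""
    # Optimization 1: same throughput, so the less loaded path wins.
    if p1.relative_throughput == p2.relative_throughput:
        return p1.in_flight_size - p2.in_flight_size

    # Optimizations 2 and 3: same load, or one path has zero throughput,
    # so the path with the higher throughput wins.
    if (p1.in_flight_size == p2.in_flight_size
            or p1.relative_throughput == 0
            or p2.relative_throughput == 0):
        return p2.relative_throughput - p1.relative_throughput

    # General case: compare (in-flight + incoming) / throughput for both
    # paths, cross-multiplied to avoid the division.
    st1 = (p1.in_flight_size + incoming) * p2.relative_throughput
    st2 = (p2.in_flight_size + incoming) * p1.relative_throughput
    if st1 != st2:
        return st1 - st2

    # Equal service time: prefer the path with the higher throughput.
    return p2.relative_throughput - p1.relative_throughput


def select_path(paths, incoming):
    """Pick the preferred path for an incoming I/O of the given size."""
    best = None
    for path in paths:
        if best is None or compare_load(path, best, incoming) < 0:
            best = path
    return best


paths = [Path("sda", 4096, 1), Path("sdb", 8192, 4)]
print(select_path(paths, 4096).name)
```

In this example sdb is chosen even though it carries more in-flight I/O,
because (8192 + 4096) / 4 is smaller than (4096 + 4096) / 1.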


Examples
========

Suppose 2 paths (sda and sdb) are used with repeat_count == 128,
where sda has an average throughput of 1GB/s and sdb has 4GB/s.
The 'relative_throughput' value may then be '1' for sda and '4' for sdb::

  # echo "0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 1 8:16 128 4" | \
    dmsetup create test
  #
  # dmsetup table
  test: 0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 1 8:16 128 4
  #
  # dmsetup status
  test: 0 10 multipath 2 0 0 0 1 1 E 0 2 2 8:0 A 0 0 1 8:16 A 0 0 4


Alternatively, '2' for sda and '8' for sdb would also be valid, since
the values are relative to each other::

  # echo "0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 2 8:16 128 8" | \
    dmsetup create test
  #
  # dmsetup table
  test: 0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 2 8:16 128 8
  #
  # dmsetup status
  test: 0 10 multipath 2 0 0 0 1 1 E 0 2 2 8:0 A 0 0 2 8:16 A 0 0 8
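
One possible way to derive such values from measured throughputs is to
reduce them to their smallest integer ratio.  The sketch below is only a
suggestion: the helper name ``relative_throughputs`` is hypothetical, and it
assumes the measurements are positive integers in a common unit.

```python
from functools import reduce
from math import gcd


def relative_throughputs(measured):
    """Reduce positive integer throughputs to the smallest equivalent ratio."""
    if not measured or any(m <= 0 for m in measured):
        raise ValueError("expected a non-empty list of positive integers")
    divisor = reduce(gcd, measured)
    reduced = [m // divisor for m in measured]
    if max(reduced) > 100:
        # The documented valid range for relative_throughput is 0-100.
        raise ValueError("ratio does not fit the 0-100 range")
    return reduced


# 1000 MB/s and 4000 MB/s reduce to the same ratio as '1' and '4'.
print(relative_throughputs([1000, 4000]))
print(relative_throughputs([2, 8]))
```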