The QEMU throttling infrastructure
==================================
Copyright (C) 2016,2020 Igalia, S.L.
Author: Alberto Garcia <berto@igalia.com>

This work is licensed under the terms of the GNU GPL, version 2 or
later. See the COPYING file in the top-level directory.

Introduction
------------
QEMU includes a throttling module that can be used to set limits to
I/O operations. The code itself is generic and independent of the I/O
units, but it is currently used to limit the number of bytes per second
and operations per second (IOPS) when performing disk I/O.

This document explains how to use the throttling code in QEMU, and how
it works internally. The implementation is in throttle.c.


Using throttling to limit disk I/O
----------------------------------
Two aspects of the disk I/O can be limited: the number of bytes per
second and the number of operations per second (IOPS). For each one of
them the user can set a global limit or separate limits for read and
write operations. This gives us a total of six different parameters.

I/O limits can be set using the throttling.* parameters of -drive, or
using the QMP 'block_set_io_throttle' command. These are the names of
the parameters for both cases:

   |-----------------------+-----------------------|
   | -drive                | block_set_io_throttle |
   |-----------------------+-----------------------|
   | throttling.iops-total | iops                  |
   | throttling.iops-read  | iops_rd               |
   | throttling.iops-write | iops_wr               |
   | throttling.bps-total  | bps                   |
   | throttling.bps-read   | bps_rd                |
   | throttling.bps-write  | bps_wr                |
   |-----------------------+-----------------------|

It is possible to set limits for both IOPS and bps at the same time,
and for each case we can decide whether to have separate read and
write limits or not, but note that if iops-total is set then neither
iops-read nor iops-write can be set.
The same applies to bps-total and bps-read/write.

The default value of these parameters is 0, and it means 'unlimited'.

In its most basic usage, the user can add a drive to QEMU with a limit
of 100 IOPS with the following -drive line:

   -drive file=hd0.qcow2,throttling.iops-total=100

We can do the same using QMP. In this case all these parameters are
mandatory, so we must set to 0 the ones that we don't want to limit:

   { "execute": "block_set_io_throttle",
     "arguments": {
        "device": "virtio0",
        "iops": 100,
        "iops_rd": 0,
        "iops_wr": 0,
        "bps": 0,
        "bps_rd": 0,
        "bps_wr": 0
     }
   }


I/O bursts
----------
In addition to the basic limits we have just seen, QEMU allows the
user to do bursts of I/O for a configurable amount of time. A burst is
an amount of I/O that can exceed the basic limit. Bursts are useful to
allow better performance when there are peaks of activity (the OS
boots, a service needs to be restarted) while keeping the average
limits lower the rest of the time.

Two parameters control bursts: their length and the maximum amount of
I/O they allow. These two can be configured separately for each one of
the six basic parameters described in the previous section, but in
this section we'll use 'iops-total' as an example.

The I/O limit during bursts is set using 'iops-total-max', and the
maximum length (in seconds) is set with 'iops-total-max-length'.
So if we want to configure a drive with a basic limit of 100 IOPS and
allow bursts of 2000 IOPS for 60 seconds, we would do it like this (the
line is split for clarity):

   -drive file=hd0.qcow2,
          throttling.iops-total=100,
          throttling.iops-total-max=2000,
          throttling.iops-total-max-length=60

Or, with QMP:

   { "execute": "block_set_io_throttle",
     "arguments": {
        "device": "virtio0",
        "iops": 100,
        "iops_rd": 0,
        "iops_wr": 0,
        "bps": 0,
        "bps_rd": 0,
        "bps_wr": 0,
        "iops_max": 2000,
        "iops_max_length": 60
     }
   }

With this, the user can perform I/O on hd0.qcow2 at a rate of 2000
IOPS for 1 minute before it's throttled down to 100 IOPS.

The user will be able to do bursts again if there's a sufficiently
long period of time with unused I/O (see below for details).

The default value for 'iops-total-max' is 0 and it means that bursts
are not allowed. 'iops-total-max-length' can only be set if
'iops-total-max' is set as well, and its default value is 1 second.
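
When QEMU is driven from a script, the QMP command shown above can be
assembled programmatically. The following Python sketch is only an
illustration: the helper name build_throttle_command is hypothetical
and not part of QEMU, and 'virtio0' is simply the device name used in
the examples of this document. It fills in the six mandatory basic
limits with 0 and rejects the combinations that this document
describes as invalid.

```python
import json

# QMP names of the six basic limits; all of them are mandatory.
BASIC_LIMITS = ("iops", "iops_rd", "iops_wr", "bps", "bps_rd", "bps_wr")


def build_throttle_command(device, **limits):
    """Build a block_set_io_throttle command as a dict (hypothetical helper)."""
    args = {"device": device}
    # Limits that the caller does not mention are set to 0 ('unlimited').
    for name in BASIC_LIMITS:
        args[name] = limits.pop(name, 0)

    # A total limit cannot be combined with separate read/write limits.
    for prefix in ("iops", "bps"):
        if args[prefix] and (args[prefix + "_rd"] or args[prefix + "_wr"]):
            raise ValueError(prefix + " cannot be combined with "
                             + prefix + "_rd or " + prefix + "_wr")

    # The remaining keys are optional settings such as iops_max.
    for name, value in limits.items():
        if name.endswith("_max_length"):
            max_name = name[:-len("_length")]
            if not limits.get(max_name):
                raise ValueError(name + " requires " + max_name)
        args[name] = value

    return {"execute": "block_set_io_throttle", "arguments": args}


cmd = build_throttle_command("virtio0", iops=100, iops_max=2000,
                             iops_max_length=60)
print(json.dumps(cmd, indent=2))
```

Using json.dumps to serialize the command also avoids hand-written
JSON mistakes such as trailing commas.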

Here's the complete list of parameters for configuring bursts:

   |----------------------------------+-----------------------|
   | -drive                           | block_set_io_throttle |
   |----------------------------------+-----------------------|
   | throttling.iops-total-max        | iops_max              |
   | throttling.iops-total-max-length | iops_max_length       |
   | throttling.iops-read-max         | iops_rd_max           |
   | throttling.iops-read-max-length  | iops_rd_max_length    |
   | throttling.iops-write-max        | iops_wr_max           |
   | throttling.iops-write-max-length | iops_wr_max_length    |
   | throttling.bps-total-max         | bps_max               |
   | throttling.bps-total-max-length  | bps_max_length        |
   | throttling.bps-read-max          | bps_rd_max            |
   | throttling.bps-read-max-length   | bps_rd_max_length     |
   | throttling.bps-write-max         | bps_wr_max            |
   | throttling.bps-write-max-length  | bps_wr_max_length     |
   |----------------------------------+-----------------------|


Controlling the size of I/O operations
--------------------------------------
When applying IOPS limits all I/O operations are treated equally
regardless of their size. This means that the user can take advantage
of this in order to circumvent the limits and submit one huge I/O
request instead of several smaller ones.

QEMU provides a setting called throttling.iops-size to prevent this
from happening. This setting specifies the size (in bytes) of an I/O
request for accounting purposes. Larger requests will be counted
proportionally to this size.

For example, if iops-size is set to 4096 then an 8KB request will be
counted as two, and a 6KB request will be counted as one and a
half. This only applies to requests larger than iops-size: smaller
requests will always be counted as one, no matter their size.

The default value of iops-size is 0 and it means that the size of the
requests is never taken into account when applying IOPS limits.
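
The accounting rule described in this section can be written down in
a few lines. This is a minimal Python sketch of the rule as stated
here, not the code from throttle.c; the function name
accounted_operations is made up for the illustration.

```python
def accounted_operations(request_bytes, iops_size=0):
    """Return how many operations a request counts as for IOPS limits."""
    # iops_size == 0 (the default) disables size-based accounting, and
    # requests no larger than iops_size always count as one operation.
    if iops_size == 0 or request_bytes <= iops_size:
        return 1.0
    # Larger requests are counted proportionally to iops_size.
    return request_bytes / iops_size


for size in (2048, 4096, 6144, 8192):
    print(size, accounted_operations(size, iops_size=4096))
```

With iops-size set to 4096 this counts the 2KB and 4KB requests as
1.0, the 6KB request as 1.5 and the 8KB request as 2.0.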


Applying I/O limits to groups of disks
--------------------------------------
In all the examples so far we have seen how to apply limits to the I/O
performed on individual drives, but QEMU allows grouping drives so
they all share the same limits.

The way it works is that each drive with I/O limits is assigned to a
group named using the throttling.group parameter. If this parameter is
not specified, then the device name (e.g. 'virtio0', 'ide0-hd0') will
be used as the group name.

Limits set using the throttling.* parameters discussed earlier in this
document apply to the combined I/O of all members of a group.

Consider this example:

   -drive file=hd1.qcow2,throttling.iops-total=6000,throttling.group=foo
   -drive file=hd2.qcow2,throttling.iops-total=6000,throttling.group=foo
   -drive file=hd3.qcow2,throttling.iops-total=3000,throttling.group=bar
   -drive file=hd4.qcow2,throttling.iops-total=6000,throttling.group=foo
   -drive file=hd5.qcow2,throttling.iops-total=3000,throttling.group=bar
   -drive file=hd6.qcow2,throttling.iops-total=5000

Here hd1, hd2 and hd4 are all members of a group named 'foo' with a
combined IOPS limit of 6000, and hd3 and hd5 are members of 'bar'. hd6
is left alone (technically it is part of a 1-member group).

Limits are applied in a round-robin fashion so if there are concurrent
I/O requests on several drives of the same group they will be
distributed evenly.

When I/O limits are applied to an existing drive using the QMP command
'block_set_io_throttle', the following things need to be taken into
account:

   - I/O limits are shared within the same group, so new values will
     affect all members and overwrite the previous settings. In other
     words: if different limits are applied to members of the same
     group, the last one wins.

   - If 'group' is unset it is assumed to be the current group of that
     drive. If the drive is not in a group yet, it will be added to a
     group named after the device name.

   - If 'group' is set then the drive will be moved to that group if
     it was a member of a different one. In this case the limits
     specified in the parameters will be applied to the new group
     only.

   - I/O limits can be disabled by setting all of them to 0. In this
     case the device will be removed from its group and the rest of
     its members will not be affected. The 'group' parameter is
     ignored.


The Leaky Bucket algorithm
--------------------------
I/O limits in QEMU are implemented using the leaky bucket algorithm
(specifically the "Leaky bucket as a meter" variant).

This algorithm uses the analogy of a bucket that leaks water
constantly. The water that gets into the bucket represents the I/O
that has been performed, and no more I/O is allowed once the bucket is
full.

To see the way this corresponds to the throttling parameters in QEMU,
consider the following values:

   iops-total=100
   iops-total-max=2000
   iops-total-max-length=60

   - Water leaks from the bucket at a rate of 100 IOPS.
   - Water can be added to the bucket at a rate of 2000 IOPS.
   - The size of the bucket is 2000 x 60 = 120000.
   - If 'iops-total-max-length' is unset then it defaults to 1 and the
     size of the bucket is 2000.
   - If 'iops-total-max' is unset then 'iops-total-max-length' must be
     unset as well. In this case the bucket size is 100.

The bucket is initially empty, therefore water can be added until it's
full at a rate of 2000 IOPS (the burst rate). Once the bucket is full
we can only add as much water as it leaks, therefore the I/O rate is
reduced to 100 IOPS. If we add less water than it leaks then the
bucket will start to empty, allowing for bursts again.
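
The bucket arithmetic above can be checked with a short calculation.
The Python sketch below is a simplified model built only from the
rules listed in this section; it is not the implementation in
throttle.c, and the function names bucket_size and seconds_until_full
are invented for the example. It assumes that I/O is submitted at a
constant rate starting with an empty bucket.

```python
import math


def bucket_size(avg, burst_max=0, burst_length=1):
    """Bucket capacity following the rules listed in this section."""
    if burst_max == 0:
        # No burst configured: the bucket size equals the basic limit.
        return avg
    return burst_max * burst_length


def seconds_until_full(avg, rate, size):
    """Time needed to fill an empty bucket when doing I/O at 'rate'."""
    if rate <= avg:
        # Water leaks at least as fast as it is added: it never fills up.
        return math.inf
    # The bucket gains (rate - avg) units of water per second.
    return size / (rate - avg)


size = bucket_size(avg=100, burst_max=2000, burst_length=60)
print(size)                                 # 120000
print(seconds_until_full(100, 2000, size))  # about 63.2 seconds
print(seconds_until_full(100, 1000, size))  # about 133.3 seconds
```

Under this model the bucket fills more slowly than the configured
burst length suggests, because water keeps leaking while it is being
added.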

Note that since water is leaking from the bucket even during bursts,
it will take a bit more than 60 seconds at 2000 IOPS to fill it
up. After those 60 seconds the bucket will have leaked 60 x 100 =
6000, allowing for 3 more seconds of I/O at 2000 IOPS.

Also, due to the way the algorithm works, longer bursts can be done at
a lower I/O rate, e.g. 1000 IOPS during 120 seconds.


The 'throttle' block filter
---------------------------
Since QEMU 2.11 it is possible to configure the I/O limits using a
'throttle' block filter. This filter uses the exact same throttling
infrastructure described above but can be used anywhere in the node
graph, allowing for more flexibility.

The user can create an arbitrary number of filters and each one of
them must be assigned to a group that contains the actual I/O limits.
Different filters can use the same group so the limits are shared as
described earlier in "Applying I/O limits to groups of disks".

A group can be created using the object-add QMP function:

   { "execute": "object-add",
     "arguments": {
        "qom-type": "throttle-group",
        "id": "group0",
        "props": {
           "limits" : {
              "iops-total": 1000,
              "bps-write": 2097152
           }
        }
     }
   }

throttle-group has a 'limits' property (of type ThrottleLimits as
defined in qapi/block-core.json) which can be set on creation or later
with 'qom-set'.

A throttle-group can also be created with the -object command line
option but at the moment there is no way to pass a 'limits' parameter
that contains a ThrottleLimits structure.
The solution is to set the individual values directly, like in this
example:

   -object throttle-group,id=group0,x-iops-total=1000,x-bps-write=2097152

Note however that this is not a stable API (hence the 'x-' prefixes) and
will disappear when -object gains support for structured options and
enables use of 'limits'.

Once we have a throttle-group we can use the throttle block filter,
where the 'file' property must be set to the block device that we want
to filter:

   { "execute": "blockdev-add",
     "arguments": {
        "options": {
           "driver": "qcow2",
           "node-name": "disk0",
           "file": {
              "driver": "file",
              "filename": "/path/to/disk.qcow2"
           }
        }
     }
   }

   { "execute": "blockdev-add",
     "arguments": {
        "driver": "throttle",
        "node-name": "throttle0",
        "throttle-group": "group0",
        "file": "disk0"
     }
   }

A similar setup can also be done with the command line, for example:

   -drive driver=throttle,throttle-group=group0,
          file.driver=qcow2,file.file.filename=/path/to/disk.qcow2

The scenario described so far is very simple but the throttle block
filter allows for more complex configurations. For example, let's say
that we have three different drives and we want to set I/O limits for
each one of them and an additional set of limits for the combined I/O
of all three drives.

First we would define all throttle groups, one for each one of the
drives and one that would apply to all of them:

   -object throttle-group,id=limits0,x-iops-total=2000
   -object throttle-group,id=limits1,x-iops-total=2500
   -object throttle-group,id=limits2,x-iops-total=3000
   -object throttle-group,id=limits012,x-iops-total=4000

Now we can define the drives, and for each one of them we use two
chained throttle filters: the drive's own filter and the combined
filter.

   -drive driver=throttle,throttle-group=limits012,
          file.driver=throttle,file.throttle-group=limits0,
          file.file.driver=qcow2,file.file.file.filename=/path/to/disk0.qcow2
   -drive driver=throttle,throttle-group=limits012,
          file.driver=throttle,file.throttle-group=limits1,
          file.file.driver=qcow2,file.file.file.filename=/path/to/disk1.qcow2
   -drive driver=throttle,throttle-group=limits012,
          file.driver=throttle,file.throttle-group=limits2,
          file.file.driver=qcow2,file.file.file.filename=/path/to/disk2.qcow2

In this example the individual drives have IOPS limits of 2000, 2500
and 3000 respectively but the total combined I/O can never exceed 4000
IOPS.
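
To make the effect of the two chained filters concrete, the following
Python sketch checks a set of sustained per-drive IOPS rates against
both kinds of limits. It only models the constraints stated in this
example (bursts are ignored) and the function name within_limits is
hypothetical.

```python
def within_limits(rates, own_limits, combined_limit):
    """Check per-drive IOPS rates against own and combined limits."""
    # Each drive must respect the limit of its own throttle group...
    each_ok = all(rate <= limit for rate, limit in zip(rates, own_limits))
    # ...and all drives together must respect the shared group.
    return each_ok and sum(rates) <= combined_limit


own = [2000, 2500, 3000]   # limits0, limits1, limits2
combined = 4000            # limits012

print(within_limits([2000, 0, 0], own, combined))        # True
print(within_limits([2000, 2500, 0], own, combined))     # False
print(within_limits([1000, 1000, 2000], own, combined))  # True
```

The second case fails even though each drive stays within its own
limit, because the combined rate of 4500 IOPS exceeds the 4000 IOPS
of the shared group.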