Dirty limit
===========

The dirty limit, short for dirty page rate upper limit, is a capability
introduced in the QEMU 8.1 release. It uses an algorithm based on the KVM
dirty ring to throttle the guest during live migration.

The algorithm framework is as follows:

::

  ------------------------------------------------------------------------------
  main   --------------> throttle thread ------------> PREPARE(1) <--------
  thread  \                                                |              |
           \                                               |              |
            \                                              V              |
             -\                                        CALCULATE(2)       |
               \                                           |              |
                \                                          |              |
                 \                                         V              |
                  \                                    SET PENALTY(3) -----
                   -\                                      |
                     \                                     |
                      \                                    V
                       -> virtual CPU thread -------> ACCEPT PENALTY(4)
  ------------------------------------------------------------------------------

When the QMP command qmp_set_vcpu_dirty_limit is called for the first time,
the QEMU main thread starts the throttle thread. Once launched, the throttle
thread executes a loop that consists of three steps:

  - PREPARE (1)

     As the name implies, PREPARE (1) gathers the inputs for the second
     stage, CALCULATE (2): the current dirty page rate and the
     corresponding upper limit of the VM. The dirty page rate is
     calculated via the KVM dirty ring mechanism, which tells QEMU how
     many pages a virtual CPU has dirtied since the last
     KVM_EXIT_DIRTY_RING_FULL exit. The dirty page rate upper limit is
     specified by the caller, so it is fetched directly.

  - CALCULATE (2)

     Calculate a suitable sleep period for each virtual CPU, which will be
     used to determine the penalty for the target virtual CPU. The
     computation must be done carefully in order to bring the dirty page
     rate progressively down to the upper limit without oscillation. To
     achieve this, two strategies are provided. The first adds or
     subtracts sleep time based on the ratio of the current dirty page
     rate to the limit, and is used when the current dirty page rate is
     far from the limit. The second adds or subtracts a fixed time, and
     is used when the current dirty page rate is close to the limit.

  - SET PENALTY (3)

     Set the sleep time for each virtual CPU that should be penalized,
     based on the result of the calculation in step CALCULATE (2).

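The two CALCULATE (2) strategies can be sketched as follows. This is an
illustrative Python model, not QEMU's C implementation: the function name
next_sleep_us, the far_ratio threshold, and the base_us and fixed_step_us
step sizes are all hypothetical values chosen for the example.

```python
def next_sleep_us(sleep_us, rate, limit,
                  base_us=1000.0, far_ratio=0.5, fixed_step_us=100.0):
    """Return the adjusted per-exit sleep time (microseconds) for one vCPU.

    All parameters other than the rate and the limit are assumptions made
    for this sketch; they are not taken from QEMU.
    """
    if rate == limit:
        return sleep_us                      # already on target

    # Over the limit: sleep longer. Under the limit: sleep less.
    sign = 1.0 if rate > limit else -1.0

    # Relative distance between the current rate and the limit, in (0, 1].
    deviation = abs(rate - limit) / max(rate, limit)

    if deviation > far_ratio:
        # Strategy 1: far from the limit, step in proportion to the distance.
        step = base_us * deviation
    else:
        # Strategy 2: close to the limit, step by a small fixed amount.
        step = fixed_step_us

    # The penalty can never be negative.
    return max(0.0, sleep_us + sign * step)
```
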
After completing the three stages above, the throttle thread loops back
to step PREPARE (1) until the dirty limit is reached.

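A toy model of the throttle loop, assuming a single virtual CPU whose dirty
page rate depends only on how long it sleeps after each full ring. The names
ToyVcpu, throttle_loop, RING_PAGES, and step_us are hypothetical, the rate is
expressed in pages per second, and CALCULATE (2) is reduced here to a plain
fixed step to keep the loop structure visible.

```python
from dataclasses import dataclass

RING_PAGES = 4000  # assumed dirty ring capacity, in pages


@dataclass
class ToyVcpu:
    fill_us: float          # time to fill the ring when unthrottled
    sleep_us: float = 0.0   # penalty slept after every full ring

    def dirty_rate(self):
        """Pages dirtied per second: one ring per (fill + sleep) cycle."""
        return RING_PAGES * 1e6 / (self.fill_us + self.sleep_us)


def throttle_loop(vcpu, limit, step_us=1000.0, max_rounds=100):
    """Repeat PREPARE, CALCULATE, SET PENALTY until the limit is reached.

    Returns the number of rounds in which a penalty was raised.
    """
    for round_no in range(max_rounds):
        rate = vcpu.dirty_rate()                 # PREPARE (1)
        if rate <= limit:
            return round_no                      # dirty limit reached
        new_sleep_us = vcpu.sleep_us + step_us   # CALCULATE (2), simplified
        vcpu.sleep_us = new_sleep_us             # SET PENALTY (3)
    return max_rounds


if __name__ == "__main__":
    vcpu = ToyVcpu(fill_us=10_000.0)             # 400000 pages/s unthrottled
    rounds = throttle_loop(vcpu, limit=100_000.0)
    print(rounds, vcpu.sleep_us)                 # 30 30000.0
```
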
Meanwhile, each virtual CPU thread reads its sleep duration and sleeps in
the path of the KVM_EXIT_DIRTY_RING_FULL exit handler; this is
ACCEPT PENALTY (4). Virtual CPUs running write-heavy processes fill their
dirty rings, take this exit path, and get penalized, whereas virtual CPUs
running read-mostly processes do not.

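The ACCEPT PENALTY (4) behaviour can be modelled as below. This is a hedged
sketch rather than QEMU code: handle_exits is a hypothetical helper, the exit
reasons are plain string labels standing in for the kernel constants, and the
sleep function is injectable so the example can run without actually waiting.

```python
import time

# Stand-in labels for this sketch, not the real kernel constants.
EXIT_DIRTY_RING_FULL = "dirty_ring_full"
EXIT_IO = "io"


def handle_exits(exit_reasons, penalty_us, sleep=time.sleep):
    """Process a vCPU's exits and return the total time slept, in microseconds.

    Only a dirty-ring-full exit triggers the penalty, so a vCPU that never
    fills its ring is never throttled.
    """
    slept_us = 0.0
    for reason in exit_reasons:
        if reason == EXIT_DIRTY_RING_FULL and penalty_us > 0:
            sleep(penalty_us / 1e6)   # ACCEPT PENALTY (4)
            slept_us += penalty_us
    return slept_us


if __name__ == "__main__":
    no_wait = lambda seconds: None    # skip real sleeping in the demo
    writer = [EXIT_DIRTY_RING_FULL, EXIT_IO, EXIT_DIRTY_RING_FULL]
    reader = [EXIT_IO, EXIT_IO]
    print(handle_exits(writer, 500.0, sleep=no_wait))   # 1000.0
    print(handle_exits(reader, 500.0, sleep=no_wait))   # 0.0
```
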
In summary, thanks to the KVM dirty ring technology, the dirty limit
algorithm restricts virtual CPUs only as needed to keep their dirty page
rate within the limit. This leads to steadier read performance during
live migration and can help improve the responsiveness of large guests.