Dirty limit
===========

The dirty limit, short for dirty page rate upper limit, is a capability
introduced in the QEMU 8.1 release. It uses an algorithm based on the KVM
dirty ring to throttle the guest during live migration.

The algorithm framework is as follows:

::

  ------------------------------------------------------------------------------
  main   --------------> throttle thread ------------> PREPARE(1) <--------
  thread  \                                                |              |
           \                                               |              |
            \                                              V              |
             -\                                       CALCULATE(2)        |
               \                                           |              |
                \                                          |              |
                 \                                         V              |
                  \                                    SET PENALTY(3) -----
                   -\                                      |
                     \                                     |
                      \                                    V
                       -> virtual CPU thread -------> ACCEPT PENALTY(4)
  ------------------------------------------------------------------------------

When the QMP command qmp_set_vcpu_dirty_limit is called for the first time,
the QEMU main thread starts the throttle thread. Once launched, the throttle
thread executes a loop that consists of three steps:

 - PREPARE (1)

   As the name implies, the entire work of PREPARE (1) is preparation for
   the second stage, CALCULATE (2).
   It involves preparing the dirty page rate value and the corresponding
   upper limit of the VM. The dirty page rate is calculated via the KVM
   dirty ring mechanism, which tells QEMU how many pages a virtual CPU has
   dirtied since the last KVM_EXIT_DIRTY_RING_FULL exit. The dirty page
   rate upper limit is specified by the caller, so it is fetched directly.

 - CALCULATE (2)

   Calculate a suitable sleep period for each virtual CPU, which will be
   used to determine the penalty for the target virtual CPU. The
   computation must be done carefully in order to reduce the dirty page
   rate progressively down to the upper limit without oscillation. To
   achieve this, two strategies are provided: the first is to add or
   subtract sleep time based on the ratio of the current dirty page rate
   to the limit, which is used when the current dirty page rate is far
   from the limit; the second is to add or subtract a fixed time when
   the current dirty page rate is close to the limit.

 - SET PENALTY (3)

   Set the sleep time for each virtual CPU that should be penalized, based
   on the results of the calculation supplied by step CALCULATE (2).

After completing the three stages above, the throttle thread loops back
to step PREPARE (1) until the dirty limit is reached.
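
The two CALCULATE (2) strategies can be illustrated with a small Python
sketch. It is not QEMU's implementation: the function name, the threshold
that separates "far" from "close", the step sizes and the sleep cap are all
hypothetical values chosen only to show the shape of the adjustment.

```python
# Illustrative sketch only. All constants below are assumptions made for
# this example; the document does not state the values QEMU uses.
FAR_THRESHOLD_PCT = 50        # gap (in % of the limit) above which the rate is "far"
PROPORTIONAL_UNIT_US = 10_000 # sleep adjustment per 100% of gap (strategy 1)
FIXED_STEP_US = 1_000         # fixed sleep adjustment (strategy 2)
MAX_SLEEP_US = 500_000        # upper bound on the sleep time


def adjust_sleep_us(sleep_us, current_rate, limit):
    """Return the next sleep time in microseconds for one virtual CPU.

    current_rate and limit are dirty page rates in the same unit.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    if current_rate == limit:
        return sleep_us

    # Relative distance between the current rate and the limit, in percent.
    gap_pct = abs(current_rate - limit) * 100 // limit
    # Sleep longer when above the limit, shorter when below it.
    sign = 1 if current_rate > limit else -1

    if gap_pct > FAR_THRESHOLD_PCT:
        # Strategy 1: far from the limit, step scales with the gap.
        step = PROPORTIONAL_UNIT_US * gap_pct // 100
    else:
        # Strategy 2: close to the limit, use a small fixed step.
        step = FIXED_STEP_US

    # Keep the result within [0, MAX_SLEEP_US].
    return min(MAX_SLEEP_US, max(0, sleep_us + sign * step))
```

With these assumed constants, a rate far above the limit produces a large
correction, while a rate close to the limit is nudged by the fixed step,
which is what keeps the sleep time from oscillating around the target.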

On the other hand, each virtual CPU thread reads the sleep duration and
sleeps in the path of the KVM_EXIT_DIRTY_RING_FULL exit handler; this
is ACCEPT PENALTY (4). Virtual CPUs running write-heavy processes will
exit to this path and get penalized, whereas virtual CPUs running
read-mostly processes will not.

In summary, thanks to the KVM dirty ring technology, the dirty limit
algorithm restricts virtual CPUs only as needed to keep their dirty page
rate within the limit. This leads to steadier read performance during
live migration and can help improve the responsiveness of large guests.
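
The effect of ACCEPT PENALTY (4) on different workloads can be seen in a toy
Python model. It is only a sketch under simplifying assumptions: the class
name, the ring size and the per-millisecond dirty counts are hypothetical,
and the model simply accumulates sleep time rather than simulating real
scheduling or KVM behaviour.

```python
# Toy model only. RING_SIZE and the VCpu class are assumptions made for
# this illustration and do not mirror QEMU or KVM data structures.
RING_SIZE = 4096  # hypothetical number of entries in a vCPU's dirty ring


class VCpu:
    def __init__(self, dirty_pages_per_ms, sleep_us):
        self.dirty_pages_per_ms = dirty_pages_per_ms  # workload write intensity
        self.sleep_us = sleep_us  # penalty set by the throttle thread
        self.ring_used = 0        # entries currently in the dirty ring
        self.ring_full_exits = 0  # number of simulated ring-full exits
        self.slept_us = 0         # total penalty accepted so far

    def run_ms(self, ms):
        """Run the guest for `ms` milliseconds of simulated time."""
        for _ in range(ms):
            self.ring_used += self.dirty_pages_per_ms
            while self.ring_used >= RING_SIZE:
                # The ring is full: the vCPU exits and accepts the penalty.
                self.ring_used -= RING_SIZE
                self.ring_full_exits += 1
                self.slept_us += self.sleep_us
```

A vCPU constructed with ``VCpu(1024, 2000)`` fills its ring every four
simulated milliseconds and accumulates sleep time on each exit, whereas
``VCpu(0, 2000)`` never fills the ring and is never penalized, even though
both were given the same sleep time.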