Documentation/scheduler/sched-energy.rst

6 ---------------
10 Energy Model (EM) of the CPUs to select an energy efficient CPU for each task,
17    /!\ EAS does not support platforms with symmetric CPU topologies /!\
19 EAS operates only on heterogeneous CPU topologies (such as Arm big.LITTLE)
25 please refer to its documentation (see Documentation/power/energy-model.rst).
29 -----------------------------
32  - energy = [joule] (resource like a battery on powered devices)
33  - power = energy/time = [joule/second] = [watt]
39 	--------------------
45 	-----------
49 optimization objective to the current performance-only objective for the
50 scheduler. This alternative considers two objectives: energy-efficiency and
54 implications of its decisions rather than blindly applying energy-saving
60 for the scheduler to decide where a task should run (during wake-up), the EM
61 is used to break the tie between several good CPU candidates and pick the one
64 knowledge about the platform's topology, which includes the 'capacity' of CPUs,
69 -----------------------
71 EAS (as well as the rest of the scheduler) uses the notion of 'capacity' to
72 differentiate CPUs with different computing throughput. The 'capacity' of a CPU
74 frequency compared to the most capable CPU of the system. Capacity values are
76 tasks and CPUs computed by the Per-Entity Load Tracking (PELT) mechanism. Thanks
77 to capacity and utilization values, EAS is able to estimate how big/busy a
78 task/CPU is, and to take this into consideration when evaluating performance vs
79 energy trade-offs. The capacity of CPUs is provided via arch-specific code
84 per 'performance domain' in the system (see Documentation/power/energy-model.rst
88 scheduling domains are built, or re-built. For each root domain (rd), the
90 the current rd->span. Each node in the list contains a pointer to a struct
103 	          PDs:   |--pd0--|--pd4--|---pd8---|
104 	          RDs:   |----rd1----|-----rd2-----|
110     present in the linked list '->pd' attached to each of them:
112        * rd1->pd: pd0 -> pd4
113        * rd2->pd: pd4 -> pd8
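
The way a performance domain ends up in the '->pd' list of every root domain
it intersects can be illustrated with a small sketch. This is not kernel code:
the CPU spans below are assumptions read off the figure (three 4-CPU
performance domains, two 6-CPU root domains), and ``pd_list`` is a
hypothetical helper name used only for illustration.

```python
# Hypothetical CPU spans, assumed from the figure: 12 CPUs split into
# three performance domains and two exclusive root domains.
PERF_DOMAINS = {
    "pd0": set(range(0, 4)),
    "pd4": set(range(4, 8)),
    "pd8": set(range(8, 12)),
}
ROOT_DOMAINS = {
    "rd1": set(range(0, 6)),
    "rd2": set(range(6, 12)),
}


def pd_list(rd_span):
    """Return the names of the performance domains intersecting rd_span."""
    return [name for name, cpus in PERF_DOMAINS.items() if cpus & rd_span]


for rd, span in ROOT_DOMAINS.items():
    print(f"{rd}->pd: " + " -> ".join(pd_list(span)))
```

Since pd4 overlaps both spans, it appears in both lists, which matches the
two lists given above.
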
128 4. Energy-Aware task placement
129 ------------------------------
131 EAS overrides the CFS task wake-up balancing code. It uses the EM of the
132 platform and the PELT signals to choose an energy-efficient target CPU during
133 wake-up balance. When EAS is enabled, select_task_rq_fair() calls
135 for the CPU with the highest spare capacity (CPU capacity - CPU utilization) in
138 save energy compared to leaving it on prev_cpu, i.e. the CPU where the task ran
148 An example of an energy-optimized task placement decision is detailed below.
159     below. CPUs 0-3 have a util_avg of 400, 100, 600 and 500 respectively
161     The CPU capacity and power cost associated with each OPP is listed in
165      CPU util.
166       1024                 - - - - - - -              Energy Model
167                                                +-----------+-------------+
169        768                 =============       +-----+-----+------+------+
171                                                +-----+-----+------+------+
172        512  ===========    - ##- - - - -       | 170 | 50  | 512  | 400  |
174        341  -PP - - - -      ##     ##         | 512 | 300 | 1024 | 1700 |
175              PP              ##     ##         +-----+-----+------+------+
176        170  -## - - - -      ##     ##
178            ------------    -------------
181       Current OPP: =====       Other OPP: - - -     util_avg (100 each): ##
185     maximum spare capacity in the two performance domains. In this example,
194       1024                 - - - - - - -
200        512  - - - - - -    - ##- - - - -     * CPU3: 500 / 768 * 800 = 520
204        170  -## - - PP-      ##     ##
206            ------------    -------------
212       1024                 - - - - - - -
218        512  - - - - - -    - ##- - -PP -     * CPU3: 700 / 768 * 800 = 729
222        170  -## - - - -      ##     ##
224            ------------    -------------
228     **Case 3. P stays on prev_cpu / CPU 0**::
230       1024                 - - - - - - -
236        512  ===========    - ##- - - - -     * CPU3: 500 / 768 * 800 = 520
238        341  -PP - - - -      ##     ##
240        170  -## - - - -      ##     ##
242            ------------    -------------
246     From these calculations, Case 1 has the lowest total energy. So CPU 1
247     is the best candidate from an energy-efficiency standpoint.
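
The comparison of the three cases can be reproduced with a short sketch. It is
a simplified model, not the kernel implementation: each performance domain is
assumed to run at the lowest OPP whose capacity covers its busiest CPU, and
the energy of a CPU is estimated as util / capacity * power. The intermediate
OPPs (341/150 and 768/800), the task utilization of 200 and all function
names are assumptions chosen to be consistent with the figures of the example.

```python
# (capacity, power) per OPP, in ascending order, for each performance domain.
# The middle rows are assumptions consistent with the example's figures.
EM = {
    "little": [(170, 50), (341, 150), (512, 300)],
    "big": [(512, 400), (768, 800), (1024, 1700)],
}
PD_OF_CPU = {0: "little", 1: "little", 2: "big", 3: "big"}
UTIL = {0: 400, 1: 100, 2: 600, 3: 500}  # util_avg; CPU0 includes task P
P_UTIL, PREV_CPU = 200, 0                # assumed task utilization


def util_without_task():
    """Per-CPU utilization once P is removed from its previous CPU."""
    util = dict(UTIL)
    util[PREV_CPU] -= P_UTIL
    return util


def candidates():
    """prev_cpu plus the CPU with the most spare capacity in each domain."""
    util = util_without_task()
    found = {PREV_CPU}
    for pd, opps in EM.items():
        max_cap = opps[-1][0]
        cpus = [c for c in util if PD_OF_CPU[c] == pd]
        found.add(max(cpus, key=lambda c: max_cap - util[c]))
    return found


def placement_energy(dst_cpu):
    """Estimate the total energy of the system if P runs on dst_cpu."""
    util = util_without_task()
    util[dst_cpu] += P_UTIL
    total = 0.0
    for pd, opps in EM.items():
        cpus = [c for c in util if PD_OF_CPU[c] == pd]
        busiest = max(util[c] for c in cpus)
        # Lowest OPP whose capacity covers the busiest CPU of the domain.
        cap, power = next(o for o in opps if o[0] >= busiest)
        total += sum(util[c] / cap * power for c in cpus)
    return total


energies = {cpu: placement_energy(cpu) for cpu in sorted(candidates())}
best = min(energies, key=energies.get)
print(energies, "-> best CPU:", best)
```

The sketch works with unrounded values, so its totals can differ by a unit or
two from hand-computed integers, but the ranking of the three placements is
unaffected.
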
251 necessarily more energy-efficient than big CPUs. For some systems, the high OPPs
252 of the little CPUs can be less energy-efficient than the lowest OPPs of the
258 And even in the case where all OPPs of the big CPUs are less energy-efficient
260 specific conditions, save energy. Indeed, placing a task on a little CPU can
263 placed on a big CPU, its own execution cost might be higher than if it was
272 CPUs of the system. Thanks to its EM-based design, EAS should cope with them
274 impact on throughput for high-utilization scenarios, EAS also implements another
275 mechanism called 'over-utilization'.
278 5. Over-utilization
279 -------------------
281 From a general standpoint, the use-cases where EAS can help the most are those
282 involving a light/medium CPU utilization. Whenever long CPU-bound tasks are
283 being run, they will require all of the available CPU capacity, and there isn't
286 'over-utilized' as soon as they are used at more than 80% of their compute
287 capacity. As long as no CPUs are over-utilized in a root domain, load balancing
288 is disabled and EAS overrides the wake-up balancing code. EAS is likely to load
290 done without harming throughput. So, the load-balancer is disabled to prevent
291 it from breaking the energy-efficient task placement found by EAS. It is safe to
298     b. all tasks should already be provided with enough CPU capacity,
300     c. since there is spare capacity, all tasks must be blocking/sleeping
301        regularly and balancing at wake-up is sufficient.
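
The 80% tipping point can be expressed as a simple integer comparison. The
sketch below only illustrates the threshold described in this section: the
function names and the 1024/1280 scaling are assumptions made for the example
and are not taken from this document.

```python
def cpu_overutilized(util, capacity):
    """True when util is above 80% of capacity (1024 / 1280 == 0.8)."""
    return util * 1280 > capacity * 1024


def root_domain_overutilized(cpus):
    """A root domain is over-utilized as soon as one of its CPUs is.

    cpus is an iterable of (util, capacity) pairs.
    """
    return any(cpu_overutilized(util, cap) for util, cap in cpus)


# Hypothetical root domain: two little CPUs (capacity 512), two big (1024).
light = [(400, 512), (100, 512), (600, 1024), (500, 1024)]
heavy = [(400, 512), (100, 512), (900, 1024), (500, 1024)]
print(root_domain_overutilized(light), root_domain_overutilized(heavy))
```

Using integer arithmetic avoids floating point in the comparison; a single
CPU crossing the threshold is enough to flag the whole root domain.
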
303 As soon as one CPU goes above the 80% tipping point, at least one of the three
305 is raised for the entire root domain, EAS is disabled, and the load-balancer is
306 re-enabled. By doing so, the scheduler falls back onto load-based algorithms for
307 wake-up and load balance under CPU-bound conditions. This provides a better
311 there is some idle time in the system, the CPU capacity 'stolen' by higher
313 such, the detection of over-utilization accounts for the capacity used not only
318 ----------------------------------------
325 6.1 - Asymmetric CPU topology
330 asymmetric CPU topologies for now. This requirement is checked at run-time by
334 See Documentation/scheduler/sched-capacity.rst for requirements to be met for this
342 6.2 - Energy Model presence
348 independent EM framework in Documentation/power/energy-model.rst.
350 Please also note that the scheduling domains need to be re-built after the
356 in milli-Watts or in an 'abstract scale'.
359 6.3 - Energy Model complexity
362 The task wake-up path is very latency-sensitive. When the EM of a platform is
364 states, ...), the cost of using it in the wake-up path can become prohibitive.
365 The energy-aware wake-up algorithm has a complexity of:
386     2. submit patches to reduce the complexity of the EAS wake-up algorithm,
390 6.4 - Schedutil governor
408 6.5 Scale-invariant utilization signals
412 states, EAS needs frequency-invariant and CPU-invariant PELT signals. These can
413 be obtained using the architecture-defined arch_scale_{cpu,freq}_capacity()
425 CPUs, which can actually be counter-productive for both performance and energy.