====================
Energy Model of CPUs
====================

1. Overview
-----------

The Energy Model (EM) framework serves as an interface between drivers that
know the power consumed by CPUs at various performance levels, and the kernel
subsystems that want to use that information to make energy-aware decisions.

The source of the information about the power consumed by CPUs can vary greatly
from one platform to another. In some cases, these power costs can be estimated
from devicetree data. In others, the firmware knows better, or userspace may be
best positioned to provide them. To spare each client subsystem from
re-implementing support for every possible source of information, the EM
framework acts as an abstraction layer that standardizes the format of power
cost tables in the kernel, thereby avoiding redundant work.

The figure below depicts an example of drivers (Arm-specific here, but the
approach is applicable to any architecture) providing power costs to the EM
framework, and interested clients reading the data from it::

       +---------------+  +-----------------+  +---------------+
       | Thermal (IPA) |  | Scheduler (EAS) |  |     Other     |
       +---------------+  +-----------------+  +---------------+
               |                   | em_pd_energy()    |
               |                   | em_cpu_get()      |
               +---------+         |         +---------+
                         |         |         |
                         v         v         v
                        +---------------------+
                        |    Energy Model     |
                        |     Framework       |
                        +---------------------+
                           ^       ^       ^
                           |       |       | em_register_perf_domain()
                +----------+       |       +---------+
                |                  |                 |
        +---------------+  +---------------+  +--------------+
        |  cpufreq-dt   |  |   arm_scmi    |  |    Other     |
        +---------------+  +---------------+  +--------------+
                ^                  ^                 ^
                |                  |                 |
        +--------------+   +---------------+  +--------------+
        | Device Tree  |   |   Firmware    |  |      ?       |
        +--------------+   +---------------+  +--------------+

The EM framework manages power cost tables per 'performance domain' in the
system. A performance domain is a group of CPUs whose performance is scaled
together. Performance domains generally have a 1-to-1 mapping with CPUFreq
policies. All CPUs in a performance domain are required to have the same
micro-architecture. CPUs in different performance domains can have different
micro-architectures.
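
As a rough illustration of the paragraph above, a performance domain can be
pictured as a set of CPUs sharing one power cost table. The sketch below is
plain userspace C. The ``sketch_*`` names, the bitmask representation and all
frequency and power values are assumptions made for this example, not the
kernel's definitions (see include/linux/energy_model.h for those).

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the kernel's data structures. */
struct sketch_cap_state {
	unsigned long frequency;	/* kHz */
	unsigned long power;		/* mW */
};

struct sketch_perf_domain {
	unsigned long cpus;			/* bit N set: CPU N is in the domain */
	const struct sketch_cap_state *table;	/* one entry per performance level */
	size_t nr_states;
};

/* Made-up numbers for a fictional system with two CPU types. */
static const struct sketch_cap_state sketch_little_table[] = {
	{  600000,  50 },
	{ 1200000, 150 },
};

static const struct sketch_cap_state sketch_big_table[] = {
	{ 1000000, 300 },
	{ 2000000, 900 },
};

/* CPUs 0-3 are scaled together, and so are CPUs 4-7: two domains. */
static const struct sketch_perf_domain sketch_domains[] = {
	{ .cpus = 0x0fUL, .table = sketch_little_table, .nr_states = 2 },
	{ .cpus = 0xf0UL, .table = sketch_big_table,    .nr_states = 2 },
};

/* Return the domain containing @cpu, or NULL if no domain covers it. */
static const struct sketch_perf_domain *
sketch_find_domain(const struct sketch_perf_domain *pds, size_t nr_pds,
		   unsigned int cpu)
{
	if (cpu >= sizeof(unsigned long) * CHAR_BIT)
		return NULL;

	for (size_t i = 0; i < nr_pds; i++) {
		if (pds[i].cpus & (1UL << cpu))
			return &pds[i];
	}

	return NULL;
}
```

In this sketch, looking up CPU 1 and CPU 5 yields two different domains, each
with its own table, which mirrors the idea that CPUs of different
micro-architectures live in different performance domains.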


2. Core APIs
------------

2.1 Config options
^^^^^^^^^^^^^^^^^^

CONFIG_ENERGY_MODEL must be enabled to use the EM framework.


2.2 Registration of performance domains
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Drivers are expected to register performance domains into the EM framework by
calling the following API::

  int em_register_perf_domain(cpumask_t *span, unsigned int nr_states,
			      struct em_data_callback *cb);

Drivers must specify the CPUs of the performance domain using the cpumask
argument, and provide a callback function returning <frequency, power> tuples
for each capacity state. The callback function provided by the driver is free
to fetch data from any relevant location (DT, firmware, ...), and by any means
deemed necessary. See Section 3 for an example of a driver implementing this
callback, and kernel/power/energy_model.c for further documentation on this
API.
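
To make the callback contract more concrete, the following userspace sketch
shows one plausible way a framework could drive such a callback to fill a
table, asking each time for the lowest state at or above a given frequency.
It is an illustration only: the ``sketch_*`` names and the OPP values are
assumptions of this example, and the authoritative logic is the one in
kernel/power/energy_model.c.

```c
#include <stddef.h>

struct sketch_state {
	unsigned long khz;
	unsigned long mw;
};

/* Made-up OPPs of a fictional platform, sorted by ascending frequency. */
static const struct sketch_state sketch_opps[] = {
	{  500000, 100 },
	{ 1000000, 250 },
	{ 1500000, 600 },
};
#define SKETCH_NR_OPPS (sizeof(sketch_opps) / sizeof(sketch_opps[0]))

/*
 * Driver side: ceil *KHz to the lowest supported frequency at or above it
 * and report the matching power. Returns 0 on success, -1 otherwise.
 */
static int sketch_est_power(unsigned long *mW, unsigned long *KHz, int cpu)
{
	(void)cpu;	/* all CPUs share the same OPPs in this sketch */

	for (size_t i = 0; i < SKETCH_NR_OPPS; i++) {
		if (sketch_opps[i].khz >= *KHz) {
			*KHz = sketch_opps[i].khz;
			*mW = sketch_opps[i].mw;
			return 0;
		}
	}

	return -1;	/* no state at or above the requested frequency */
}

/*
 * Framework side: call the callback once per capacity state, starting from
 * frequency 0 and then asking for "anything above the previous state".
 */
static int sketch_build_table(struct sketch_state *table, size_t nr_states,
			      int cpu,
			      int (*cb)(unsigned long *, unsigned long *, int))
{
	unsigned long freq = 0, prev_freq = 0, power = 0;

	for (size_t i = 0; i < nr_states; i++, freq++) {
		int ret = cb(&power, &freq, cpu);

		if (ret)
			return ret;

		/* Reject callbacks that do not return ascending frequencies. */
		if (i > 0 && freq <= prev_freq)
			return -1;

		table[i].khz = prev_freq = freq;
		table[i].mw = power;
	}

	return 0;
}
```

Requesting three states fills the table with the three made-up OPPs, while
requesting a fourth one fails because the callback has nothing left to report.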


2.3 Accessing performance domains
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Subsystems interested in the energy model of a CPU can retrieve it using the
em_cpu_get() API. The energy model tables are allocated once upon creation of
the performance domains, and kept in memory untouched.

The energy consumed by a performance domain can be estimated using the
em_pd_energy() API. The estimation is performed assuming that the schedutil
CPUFreq governor is in use.
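
The general shape of such an estimate can be sketched in userspace C as
follows. This is a simplified illustration, not the kernel implementation:
the ``sketch_*`` names and table values are assumptions, and so is the way
the schedutil assumption is modelled here (a target frequency proportional to
the highest CPU utilization in the domain, plus 25% headroom). The result is
in abstract units that are only meaningful for comparing scenarios.

```c
#include <stddef.h>

struct sketch_state {
	unsigned long khz;
	unsigned long mw;
};

/* Made-up table, sorted by ascending frequency. */
static const struct sketch_state sketch_table[] = {
	{  500000, 100 },
	{ 1000000, 250 },
	{ 1500000, 600 },
};

/*
 * @cpu_capacity: capacity of one CPU of the domain at its highest frequency
 * @max_util:     highest utilization among the CPUs of the domain
 * @sum_util:     sum of the utilization of all CPUs of the domain
 */
static unsigned long sketch_pd_energy(const struct sketch_state *table,
				      size_t nr_states,
				      unsigned long cpu_capacity,
				      unsigned long max_util,
				      unsigned long sum_util)
{
	const struct sketch_state *cs;
	unsigned long long target, max_khz;

	if (!nr_states || !cpu_capacity)
		return 0;

	/* Frequency request modelled after schedutil: 1.25 * fmax * util / cap. */
	max_khz = table[nr_states - 1].khz;
	target = (max_khz + (max_khz >> 2)) * max_util / cpu_capacity;

	/* Pick the lowest state satisfying the request, else the highest one. */
	cs = &table[nr_states - 1];
	for (size_t i = 0; i < nr_states; i++) {
		if (table[i].khz >= target) {
			cs = &table[i];
			break;
		}
	}

	/*
	 * Running slower stretches busy time by fmax / f, so the power of the
	 * state is scaled accordingly before weighting it by utilization.
	 */
	return (unsigned long)(cs->mw * max_khz / cs->khz * sum_util / cpu_capacity);
}
```

With the made-up table above and a CPU capacity of 1024, a domain with
max_util = 256 and sum_util = 512 is estimated at 150 units at the lowest
state, whereas max_util = 1024 and sum_util = 2048 forces the highest state
and yields 1200 units.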

More details about the above APIs can be found in include/linux/energy_model.h.


3. Example driver
-----------------

This section provides a simple example of a CPUFreq driver registering a
performance domain in the Energy Model framework using the (fake) 'foo'
protocol. The driver implements an est_power() function to be provided to the
EM framework::

  -> drivers/cpufreq/foo_cpufreq.c

  static int est_power(unsigned long *mW, unsigned long *KHz, int cpu)
  {
  	long freq, power;

  	/* Use the 'foo' protocol to ceil the frequency */
  	freq = foo_get_freq_ceil(cpu, *KHz);
  	if (freq < 0)
  		return freq;

  	/* Estimate the power cost for the CPU at the relevant freq. */
  	power = foo_estimate_power(cpu, freq);
  	if (power < 0)
  		return power;

  	/* Return the values to the EM framework */
  	*mW = power;
  	*KHz = freq;

  	return 0;
  }

  static int foo_cpufreq_init(struct cpufreq_policy *policy)
  {
  	struct em_data_callback em_cb = EM_DATA_CB(est_power);
  	int nr_opp, ret;

  	/* Do the actual CPUFreq init work ... */
  	ret = do_foo_cpufreq_init(policy);
  	if (ret)
  		return ret;

  	/* Find the number of OPPs for this policy */
  	nr_opp = foo_get_nr_opp(policy);

  	/* And register the new performance domain */
  	em_register_perf_domain(policy->cpus, nr_opp, &em_cb);

  	return 0;
  }