====================
Energy Model of CPUs
====================

1. Overview
-----------

The Energy Model (EM) framework serves as an interface between drivers that
know the power consumed by CPUs at various performance levels, and the kernel
subsystems that want to use that information to make energy-aware decisions.

The source of the information about the power consumed by CPUs can vary greatly
from one platform to another. These power costs can be estimated using
devicetree data in some cases. In others, the firmware will know better.
Alternatively, userspace might be best positioned, and so on. To avoid having
every client subsystem re-implement support for every possible source of
information on its own, the EM framework acts as an abstraction layer that
standardizes the format of power cost tables in the kernel, thereby avoiding
redundant work.

The figure below depicts an example of drivers (Arm-specific here, but the
approach is applicable to any architecture) providing power costs to the EM
framework, and interested clients reading the data from it::

       +---------------+  +-----------------+  +---------------+
       | Thermal (IPA) |  | Scheduler (EAS) |  |     Other     |
       +---------------+  +-----------------+  +---------------+
               |                   | em_pd_energy()    |
               |                   | em_cpu_get()      |
               +---------+         |         +---------+
                         |         |         |
                         v         v         v
                        +---------------------+
                        |    Energy Model     |
                        |     Framework       |
                        +---------------------+
                           ^       ^       ^
                           |       |       | em_register_perf_domain()
                +----------+       |       +---------+
                |                  |                 |
        +---------------+  +---------------+  +--------------+
        |  cpufreq-dt   |  |   arm_scmi    |  |    Other     |
        +---------------+  +---------------+  +--------------+
                ^                  ^                 ^
                |                  |                 |
        +--------------+   +---------------+  +--------------+
        | Device Tree  |   |   Firmware    |  |      ?       |
        +--------------+   +---------------+  +--------------+

The EM framework manages power cost tables per 'performance domain' in the
system. A performance domain is a group of CPUs whose performance is scaled
together. Performance domains generally have a 1-to-1 mapping with CPUFreq
policies. All CPUs in a performance domain are required to have the same
micro-architecture. CPUs in different performance domains can have different
micro-architectures.


2. Core APIs
------------

2.1 Config options
^^^^^^^^^^^^^^^^^^

CONFIG_ENERGY_MODEL must be enabled to use the EM framework.


2.2 Registration of performance domains
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Drivers are expected to register performance domains into the EM framework by
calling the following API::

  int em_register_perf_domain(cpumask_t *span, unsigned int nr_states,
                              struct em_data_callback *cb);

Drivers must specify the CPUs of the performance domain using the cpumask
argument, and provide a callback function returning <frequency, power> tuples
for each capacity state. The callback function provided by the driver is free
to fetch data from any relevant location (DT, firmware, ...), and by any means
deemed necessary. See Section 3 for an example of a driver implementing this
callback, and kernel/power/energy_model.c for further documentation on this
API.


2.3 Accessing performance domains
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Subsystems interested in the energy model of a CPU can retrieve it using the
em_cpu_get() API. The energy model tables are allocated once upon creation of
the performance domains, and kept in memory untouched.
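
As a rough illustration of how a client might consume such a read-only table,
the userspace sketch below models a performance domain as an array of
<frequency, power> entries sorted by ascending frequency, and looks up the
lowest entry that satisfies a requested frequency. The names (``cap_state``,
``perf_domain``, ``pd_find_state()``) and all numbers are hypothetical
stand-ins chosen for this example, not the kernel's definitions; the real ones
are in include/linux/energy_model.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one row of a power cost table. */
struct cap_state {
	unsigned long frequency;	/* kHz */
	unsigned long power;		/* mW */
};

/* Hypothetical stand-in for the table of one performance domain. */
struct perf_domain {
	const struct cap_state *table;	/* sorted by ascending frequency */
	size_t nr_cap_states;
};

/*
 * Return the lowest state whose frequency is >= freq, or the highest
 * state if freq exceeds every entry. The table is only read, never
 * modified.
 */
static const struct cap_state *pd_find_state(const struct perf_domain *pd,
					     unsigned long freq)
{
	size_t i;

	assert(pd && pd->nr_cap_states > 0);

	for (i = 0; i < pd->nr_cap_states; i++) {
		if (pd->table[i].frequency >= freq)
			return &pd->table[i];
	}

	return &pd->table[pd->nr_cap_states - 1];
}

/* Made-up numbers for a three-state domain. */
static const struct cap_state example_table[] = {
	{  500000, 100 },
	{ 1000000, 300 },
	{ 1500000, 700 },
};

static const struct perf_domain example_pd = {
	.table = example_table,
	.nr_cap_states = 3,
};
```

With this made-up table, a request for 500001 kHz resolves to the 1000000 kHz
entry, and a request above 1500000 kHz is clamped to the last entry.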

The energy consumed by a performance domain can be estimated using the
em_pd_energy() API. The estimation is performed assuming that the schedutil
CPUfreq governor is in use.

More details about the above APIs can be found in include/linux/energy_model.h.


3. Example driver
-----------------

This section provides a simple example of a CPUFreq driver registering a
performance domain in the Energy Model framework using the (fake) 'foo'
protocol. The driver implements an est_power() function to be provided to the
EM framework::

	-> drivers/cpufreq/foo_cpufreq.c

	static int est_power(unsigned long *mW, unsigned long *KHz, int cpu)
	{
		long freq, power;

		/* Use the 'foo' protocol to ceil the frequency */
		freq = foo_get_freq_ceil(cpu, *KHz);
		if (freq < 0)
			return freq;

		/* Estimate the power cost for the CPU at the relevant freq. */
		power = foo_estimate_power(cpu, freq);
		if (power < 0)
			return power;

		/* Return the values to the EM framework */
		*mW = power;
		*KHz = freq;

		return 0;
	}

	static int foo_cpufreq_init(struct cpufreq_policy *policy)
	{
		struct em_data_callback em_cb = EM_DATA_CB(est_power);
		int nr_opp, ret;

		/* Do the actual CPUFreq init work ... */
		ret = do_foo_cpufreq_init(policy);
		if (ret)
			return ret;

		/* Find the number of OPPs for this policy */
		nr_opp = foo_get_nr_opp(policy);

		/* And register the new performance domain */
		em_register_perf_domain(policy->cpus, nr_opp, &em_cb);

		return 0;
	}
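
To make the callback contract more concrete, here is a minimal userspace sketch
of how a framework could drive an est_power()-style callback to build a table:
it asks for the lowest supported frequency at or above a running floor, records
the returned <frequency, power> pair, and then moves the floor just past it.
All names (``fake_est_power()``, ``build_table()``, ``fake_opps``) and all
numbers are hypothetical and stand in for the 'foo' protocol. This is a sketch
of the idea, not the kernel's implementation; see kernel/power/energy_model.c
for that.

```c
#include <assert.h>
#include <stddef.h>

#define NR_OPP 3

/* Made-up <kHz, mW> operating points standing in for the 'foo' protocol. */
static const unsigned long fake_opps[NR_OPP][2] = {
	{  500000, 100 },
	{ 1000000, 300 },
	{ 1500000, 700 },
};

/* Hypothetical stand-in for one row of a power cost table. */
struct cap_state {
	unsigned long frequency;	/* kHz */
	unsigned long power;		/* mW */
};

/*
 * Same shape as est_power(): round *khz up to the next supported
 * frequency and report the power at that frequency. The cpu argument
 * is unused in this toy model.
 */
static int fake_est_power(unsigned long *mw, unsigned long *khz, int cpu)
{
	size_t i;

	(void)cpu;

	for (i = 0; i < NR_OPP; i++) {
		if (fake_opps[i][0] >= *khz) {
			*khz = fake_opps[i][0];
			*mw = fake_opps[i][1];
			return 0;
		}
	}

	return -1;	/* no supported frequency at or above *khz */
}

/*
 * Fill table[] with nr_states entries by querying the callback with an
 * increasing frequency floor. Returns 0 on success, -1 on failure.
 */
static int build_table(struct cap_state *table, size_t nr_states, int cpu,
		       int (*cb)(unsigned long *, unsigned long *, int))
{
	unsigned long freq = 0, power = 0;
	size_t i;

	assert(table && cb);

	for (i = 0; i < nr_states; i++, freq++) {
		/* The callback rounds freq up and fills in power. */
		if (cb(&power, &freq, cpu))
			return -1;

		table[i].frequency = freq;
		table[i].power = power;
	}

	return 0;
}
```

Calling ``build_table()`` with ``NR_OPP`` states yields the three made-up
operating points in ascending order. Asking for more states than the callback
can provide makes it fail, which is why a driver has to pass an accurate state
count at registration time.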