.. SPDX-License-Identifier: GPL-2.0

===============================================
How to Implement a new CPUFreq Processor Driver
===============================================

Authors:


	- Dominik Brodowski  <linux@brodo.de>
	- Rafael J. Wysocki <rafael.j.wysocki@intel.com>
	- Viresh Kumar <viresh.kumar@linaro.org>

.. Contents

   1.   What To Do?
   1.1  Initialization
   1.2  Per-CPU Initialization
   1.3  verify
   1.4  target or target_index or setpolicy or fast_switch?
   1.5  target/target_index
   1.6  fast_switch
   1.7  setpolicy
   1.8  get_intermediate and target_intermediate
   2.   Frequency Table Helpers



1. What To Do?
==============

So, you just got a brand-new CPU / chipset with datasheets and want to
add cpufreq support for this CPU / chipset? Great. Here are some hints
on what is necessary:


1.1 Initialization
------------------

First of all, in an __initcall level 7 (module_init()) or later
function, check whether this kernel runs on the right CPU and the right
chipset. If so, register a struct cpufreq_driver with the CPUfreq core
using cpufreq_register_driver().

What shall this struct cpufreq_driver contain?

 .name - The name of this driver.

 .init - A pointer to the per-policy initialization function.

 .verify - A pointer to a "verification" function.

 .setpolicy _or_ .fast_switch _or_ .target _or_ .target_index - See
 below on the differences.

And optionally

 .flags - Hints for the cpufreq core.

 .driver_data - cpufreq driver specific data.

 .resolve_freq - Returns the most appropriate frequency for a target
 frequency. Doesn't change the frequency though.

 .get_intermediate and .target_intermediate - Used to switch to a stable
 frequency while changing the CPU frequency.

 .get - Returns the current frequency of the CPU.

 .bios_limit - Returns HW/BIOS max frequency limitations for the CPU.

 .exit - A pointer to a per-policy cleanup function called during the
 CPU_POST_DEAD phase of the CPU hotplug process.

 .stop_cpu - A pointer to a per-policy stop function called during the
 CPU_DOWN_PREPARE phase of the CPU hotplug process.

 .suspend - A pointer to a per-policy suspend function which is called
 with interrupts disabled and _after_ the governor is stopped for the
 policy.

 .resume - A pointer to a per-policy resume function which is called
 with interrupts disabled and _before_ the governor is started again.

 .ready - A pointer to a per-policy ready function which is called after
 the policy is fully initialized.

 .attr - A pointer to a NULL-terminated list of "struct freq_attr" which
 allows exporting values to sysfs.

 .boost_enabled - If set, boost frequencies are enabled.

 .set_boost - A pointer to a per-policy function to enable/disable boost
 frequencies.


1.2 Per-CPU Initialization
--------------------------

Whenever a new CPU is registered with the device model, or after the
cpufreq driver registers itself, the per-policy initialization function
cpufreq_driver.init is called if no cpufreq policy existed for the CPU.
Note that the .init() and .exit() routines are called only once for the
policy and not for each CPU managed by the policy. The .init() function
takes a ``struct cpufreq_policy *policy`` as argument. What to do now?

If necessary, activate the CPUfreq support on your CPU.

Then, the driver must fill in the following values:

+-----------------------------------+--------------------------------------+
|policy->cpuinfo.min_freq _and_     |                                      |
|policy->cpuinfo.max_freq           | the minimum and maximum frequency    |
|                                   | (in kHz) which is supported by       |
|                                   | this CPU                             |
+-----------------------------------+--------------------------------------+
|policy->cpuinfo.transition_latency | the time it takes on this CPU to     |
|                                   | switch between two frequencies in    |
|                                   | nanoseconds (if appropriate, else    |
|                                   | specify CPUFREQ_ETERNAL)             |
+-----------------------------------+--------------------------------------+
|policy->cur                        | The current operating frequency of   |
|                                   | this CPU (if appropriate)            |
+-----------------------------------+--------------------------------------+
|policy->min,                       |                                      |
|policy->max,                       |                                      |
|policy->policy and, if necessary,  |                                      |
|policy->governor                   | must contain the "default policy" for|
|                                   | this CPU. A few moments later,       |
|                                   | cpufreq_driver.verify and either     |
|                                   | cpufreq_driver.setpolicy or          |
|                                   | cpufreq_driver.target/target_index is|
|                                   | called with these values.            |
+-----------------------------------+--------------------------------------+
|policy->cpus                       | Update this with the masks of the    |
|                                   | (online + offline) CPUs that do DVFS |
|                                   | along with this CPU (i.e. that share |
|                                   | clock/voltage rails with it).        |
+-----------------------------------+--------------------------------------+

For setting some of these values (cpuinfo.min[max]_freq, policy->min[max]), the
frequency table helpers might be helpful. See section 2 for more information
on them.


1.3 verify
----------

When the user decides a new policy (consisting of
"policy,governor,min,max") shall be set, this policy must be validated
so that incompatible values can be corrected. For verifying these
values, the cpufreq_verify_within_limits(``struct cpufreq_policy *policy``,
``unsigned int min_freq``, ``unsigned int max_freq``) function might be helpful.
See section 2 for details on frequency table helpers.

You need to make sure that at least one valid frequency (or operating
range) is within policy->min and policy->max. If necessary, increase
policy->max first, and only if this is no solution, decrease policy->min.
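
The range clamping described above can be modelled in a few lines. The
following is a userspace sketch under stated assumptions, not kernel code:
``struct model_policy`` and ``model_verify_within_limits()`` are hypothetical
names invented for this illustration::

	#include <assert.h>

	/* Hypothetical stand-in for the min/max fields of a cpufreq policy. */
	struct model_policy {
		unsigned int min;	/* requested lower limit, in kHz */
		unsigned int max;	/* requested upper limit, in kHz */
	};

	/* Clamp policy->min and policy->max into [min_freq, max_freq]. */
	static void model_verify_within_limits(struct model_policy *policy,
					       unsigned int min_freq,
					       unsigned int max_freq)
	{
		assert(min_freq <= max_freq);

		if (policy->min < min_freq)
			policy->min = min_freq;
		if (policy->max < min_freq)
			policy->max = min_freq;
		if (policy->min > max_freq)
			policy->min = max_freq;
		if (policy->max > max_freq)
			policy->max = max_freq;
		/* Keep the resulting range non-empty. */
		if (policy->min > policy->max)
			policy->min = policy->max;
	}

The sketch only clamps the requested range to the hardware limits. Deciding
whether to raise policy->max or lower policy->min so that a valid frequency
falls inside the range remains the driver's job.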


1.4 target or target_index or setpolicy or fast_switch?
-------------------------------------------------------

Most cpufreq drivers or even most CPU frequency scaling algorithms
only allow the CPU frequency to be set to predefined fixed values. For
these, you use the ->target(), ->target_index() or ->fast_switch()
callbacks.

Some cpufreq-capable processors switch the frequency between certain
limits on their own. These shall use the ->setpolicy() callback.


1.5 target/target_index
-----------------------

The target_index call has two arguments: ``struct cpufreq_policy *policy``,
and ``unsigned int`` index (into the exposed frequency table).

The CPUfreq driver must set the new frequency when called here. The
actual frequency must be determined by freq_table[index].frequency.

It should always restore the earlier frequency (i.e. policy->restore_freq) in
case of errors, even if it switched to an intermediate frequency earlier.

Deprecated
----------
The target call has three arguments: ``struct cpufreq_policy *policy``,
``unsigned int target_freq``, ``unsigned int relation``.

The CPUfreq driver must set the new frequency when called here. The
actual frequency must be determined using the following rules:

- keep close to "target_freq"
- policy->min <= new_freq <= policy->max (THIS MUST BE VALID!!!)
- if relation==CPUFREQ_REL_L, try to select a new_freq higher than or equal
  to target_freq. ("L for lowest, but no lower than")
- if relation==CPUFREQ_REL_H, try to select a new_freq lower than or equal
  to target_freq. ("H for highest, but no higher than")

Here again the frequency table helper might assist you - see section 2
for details.
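
To make the selection rules above concrete, here is a small userspace sketch
of one way to apply them to an unsorted table. It is an illustration only:
``struct sketch_entry``, ``SKETCH_TABLE_END``, ``enum sketch_relation`` and
``sketch_pick_index()`` are hypothetical names made up for this example, and
the fallback to the nearest in-range entry when no entry satisfies the
relation is an assumption of the sketch::

	#include <assert.h>
	#include <stddef.h>

	/* Hypothetical stand-ins, defined for this example only. */
	#define SKETCH_TABLE_END	(~1u)

	enum sketch_relation { SKETCH_REL_L, SKETCH_REL_H };

	struct sketch_entry {
		unsigned int frequency;	/* kHz, or SKETCH_TABLE_END */
	};

	/*
	 * Return the index of the entry to switch to, or -1 if no entry
	 * lies within [min, max].
	 */
	static int sketch_pick_index(const struct sketch_entry *table,
				     unsigned int min, unsigned int max,
				     unsigned int target_freq,
				     enum sketch_relation relation)
	{
		int best = -1, fallback = -1;
		size_t i;

		assert(table != NULL);

		for (i = 0; table[i].frequency != SKETCH_TABLE_END; i++) {
			unsigned int freq = table[i].frequency;

			/* new_freq must stay within the policy limits. */
			if (freq < min || freq > max)
				continue;

			if (relation == SKETCH_REL_L) {
				/* Lowest frequency at or above the target. */
				if (freq >= target_freq) {
					if (best < 0 || freq < table[best].frequency)
						best = (int)i;
				} else if (fallback < 0 ||
					   freq > table[fallback].frequency) {
					fallback = (int)i;
				}
			} else {
				/* Highest frequency at or below the target. */
				if (freq <= target_freq) {
					if (best < 0 || freq > table[best].frequency)
						best = (int)i;
				} else if (fallback < 0 ||
					   freq < table[fallback].frequency) {
					fallback = (int)i;
				}
			}
		}

		return best >= 0 ? best : fallback;
	}

The policy limits are checked before the relation, which reflects the rule
that policy->min <= new_freq <= policy->max must always hold.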

1.6 fast_switch
---------------

This function is used for frequency switching from the scheduler's context.
Not all drivers are expected to implement it, as sleeping from within
this callback isn't allowed. This callback must be highly optimized to
do switching as fast as possible.

This function has two arguments: ``struct cpufreq_policy *policy`` and
``unsigned int target_frequency``.


1.7 setpolicy
-------------

The setpolicy call only takes a ``struct cpufreq_policy *policy`` as
argument. You need to set the lower limit of the in-processor or
in-chipset dynamic frequency switching to policy->min, the upper limit
to policy->max, and, if supported, select a performance-oriented
setting when policy->policy is CPUFREQ_POLICY_PERFORMANCE, and a
powersaving-oriented setting when it is CPUFREQ_POLICY_POWERSAVE. Also
check the reference implementation in drivers/cpufreq/longrun.c.

1.8 get_intermediate and target_intermediate
--------------------------------------------

Only for drivers with target_index() and CPUFREQ_ASYNC_NOTIFICATION unset.

get_intermediate() should return a stable intermediate frequency that the
platform wants to switch to, and target_intermediate() should set the CPU to
that frequency before jumping to the frequency corresponding to 'index'. The
core will take care of sending notifications, and the driver doesn't have to
handle them in target_intermediate() or target_index().

Drivers can return '0' from get_intermediate() in case they don't wish to switch
to an intermediate frequency for some target frequency. In that case the core
will directly call ->target_index().

NOTE: ->target_index() should restore to policy->restore_freq in case of
failures, as the core would send notifications for that.
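
The ordering described in this section can be pictured with a small userspace
model. This is a simplified sketch, not the core's implementation: every name
in it (``struct model_cpu``, ``model_core_switch()`` and the three
``model_*`` callbacks) is hypothetical, notifications are left out, and the
final callback takes a frequency instead of a table index to keep the example
short::

	#include <assert.h>

	/* Hypothetical model of a CPU clock and its driver state. */
	struct model_cpu {
		unsigned int cur;		/* current frequency, kHz */
		unsigned int restore_freq;	/* frequency before the switch */
		unsigned int intermediate;	/* stable frequency, 0 = none */
		int fail_final;			/* simulate a failing switch */
	};

	static unsigned int model_get_intermediate(const struct model_cpu *cpu)
	{
		return cpu->intermediate;
	}

	static int model_target_intermediate(struct model_cpu *cpu)
	{
		cpu->cur = cpu->intermediate;
		return 0;
	}

	static int model_target_index(struct model_cpu *cpu, unsigned int new_freq)
	{
		if (cpu->fail_final) {
			/* On failure, go back to where the switch started. */
			cpu->cur = cpu->restore_freq;
			return -1;
		}
		cpu->cur = new_freq;
		return 0;
	}

	/* Model of the call sequence: intermediate step first, if any. */
	static int model_core_switch(struct model_cpu *cpu, unsigned int new_freq)
	{
		int ret;

		assert(cpu);
		cpu->restore_freq = cpu->cur;

		if (model_get_intermediate(cpu)) {
			ret = model_target_intermediate(cpu);
			if (ret)
				return ret;
		}

		return model_target_index(cpu, new_freq);
	}

In the failing case the model ends up at the frequency it started from, even
though it passed through the intermediate frequency on the way.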


2. Frequency Table Helpers
==========================

As most cpufreq processors only allow for being set to a few specific
frequencies, a "frequency table" with some functions might assist in
some work of the processor driver. Such a "frequency table" consists of
an array of struct cpufreq_frequency_table entries, with driver specific
values in "driver_data", the corresponding frequency in "frequency" and
flags set. At the end of the table, you need to add a
cpufreq_frequency_table entry with frequency set to CPUFREQ_TABLE_END.
And if you want to skip one entry in the table, set the frequency to
CPUFREQ_ENTRY_INVALID. The entries don't need to be sorted in any
particular order, but if they are, the cpufreq core will do DVFS a bit
more quickly for them, as the search for the best match is faster.
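
As an illustration of this layout, the userspace sketch below builds an
unsorted table with one skipped entry and a terminator, then walks it to find
the lowest and highest valid frequency. It is a model under stated
assumptions: ``struct model_freq_entry``, ``MODEL_ENTRY_INVALID``,
``MODEL_TABLE_END`` and ``model_table_limits()`` are hypothetical stand-ins
defined here, and the driver_data values are arbitrary::

	#include <assert.h>

	/* Hypothetical marker values, defined for this example only. */
	#define MODEL_ENTRY_INVALID	(~0u)
	#define MODEL_TABLE_END		(~1u)

	struct model_freq_entry {
		unsigned int driver_data;	/* driver-specific value */
		unsigned int frequency;		/* kHz, or a marker above */
	};

	static const struct model_freq_entry model_table[] = {
		{ 0x2, 1200000 },
		{ 0x0,  600000 },
		{ 0x7, MODEL_ENTRY_INVALID },	/* skipped while iterating */
		{ 0x3, 1800000 },
		{ 0x0, MODEL_TABLE_END },	/* terminator */
	};

	/* Store the lowest and highest valid frequency; -1 if none exists. */
	static int model_table_limits(const struct model_freq_entry *table,
				      unsigned int *min_freq,
				      unsigned int *max_freq)
	{
		const struct model_freq_entry *pos;
		int found = 0;

		assert(table && min_freq && max_freq);

		for (pos = table; pos->frequency != MODEL_TABLE_END; pos++) {
			if (pos->frequency == MODEL_ENTRY_INVALID)
				continue;
			if (!found || pos->frequency < *min_freq)
				*min_freq = pos->frequency;
			if (!found || pos->frequency > *max_freq)
				*max_freq = pos->frequency;
			found = 1;
		}

		return found ? 0 : -1;
	}

Because the table is unsorted, the walk has to visit every entry, which is
the cost that a sorted table lets the search avoid.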

The cpufreq table is verified automatically by the core if the policy contains a
valid pointer in its policy->freq_table field.

cpufreq_frequency_table_verify() assures that at least one valid
frequency is within policy->min and policy->max, and all other criteria
are met. This is helpful for the ->verify call.

cpufreq_frequency_table_target() is the corresponding frequency table
helper for the ->target stage. Just pass the values to this function,
and this function returns the index of the frequency table entry which
contains the frequency the CPU shall be set to.

The following macros can be used as iterators over cpufreq_frequency_table:

cpufreq_for_each_entry(pos, table) - iterates over all entries of the
frequency table.

cpufreq_for_each_valid_entry(pos, table) - iterates over all entries,
excluding CPUFREQ_ENTRY_INVALID frequencies.

Use arguments "pos" - a ``cpufreq_frequency_table *`` as a loop cursor and
"table" - the ``cpufreq_frequency_table *`` you want to iterate over.

For example::

	struct cpufreq_frequency_table *pos, *driver_freq_table;

	cpufreq_for_each_entry(pos, driver_freq_table) {
		/* Do something with pos */
		pos->frequency = ...
	}

If you need to work with the position of pos within driver_freq_table,
do not subtract the pointers, as it is quite costly. Instead, use the
macros cpufreq_for_each_entry_idx() and cpufreq_for_each_valid_entry_idx().
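
The idea of carrying an index alongside the cursor, rather than computing it
by pointer subtraction, can be sketched in userspace as follows. The macro
``idx_for_each_entry()``, ``struct idx_entry``, ``IDX_TABLE_END`` and
``idx_find()`` are hypothetical analogues written for this example; they are
not the kernel macros named above::

	#include <assert.h>

	/* Hypothetical terminator value, defined for this example only. */
	#define IDX_TABLE_END	(~1u)

	struct idx_entry {
		unsigned int frequency;	/* kHz, or IDX_TABLE_END */
	};

	/* Advance the cursor and the index together. */
	#define idx_for_each_entry(pos, table, idx)			\
		for ((pos) = (table), (idx) = 0;			\
		     (pos)->frequency != IDX_TABLE_END;			\
		     (pos)++, (idx)++)

	/* Return the index of the entry holding freq, or -1 if absent. */
	static int idx_find(const struct idx_entry *table, unsigned int freq)
	{
		const struct idx_entry *pos;
		int idx;

		assert(table);

		idx_for_each_entry(pos, table, idx) {
			if (pos->frequency == freq)
				return idx;
		}

		return -1;
	}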