===================================
Generic Thermal Sysfs driver How To
===================================

Written by Sujith Thomas <sujith.thomas@intel.com>, Zhang Rui <rui.zhang@intel.com>

Updated: 2 January 2008

Copyright (c)  2008 Intel Corporation


0. Introduction
===============

The generic thermal sysfs provides a set of interfaces for thermal zone
devices (sensors) and thermal cooling devices (fan, processor...) to register
with the thermal management solution and to be a part of it.

This how-to focuses on enabling new thermal zone and cooling devices to
participate in thermal management.
This solution is platform independent, and any type of thermal zone device
and cooling device should be able to make use of the infrastructure.

The main task of the thermal sysfs driver is to expose thermal zone attributes
as well as cooling device attributes to the user space.
An intelligent thermal management application can make decisions based on
inputs from thermal zone attributes (the current temperature and trip point
temperature) and throttle appropriate devices.

- `[0-*]`	denotes any non-negative integer, starting from 0
- `[1-*]`	denotes any positive integer, starting from 1

1. thermal sysfs driver interface functions
===========================================

1.1 thermal zone device interface
---------------------------------

    ::

	struct thermal_zone_device
	*thermal_zone_device_register(char *type,
				      int trips, int mask, void *devdata,
				      struct thermal_zone_device_ops *ops,
				      const struct thermal_zone_params *tzp,
				      int passive_delay, int polling_delay)

    This interface function adds a new thermal zone device (sensor) to
    /sys/class/thermal folder as `thermal_zone[0-*]`. It tries to bind all the
    thermal cooling devices registered at the same time.

    type:
	the thermal zone type.
    trips:
	the total number of trip points this thermal zone supports.
    mask:
	Bit string: If 'n'th bit is set, then trip point 'n' is writable.
    devdata:
	device private data
    ops:
	thermal zone device call-backs.

	.bind:
		bind the thermal zone device with a thermal cooling device.
	.unbind:
		unbind the thermal zone device from a thermal cooling device.
	.get_temp:
		get the current temperature of the thermal zone.
	.set_trips:
		    set the trip points window. Whenever the current temperature
		    is updated, the trip points immediately below and above the
		    current temperature are found.
	.get_mode:
		   get the current mode (enabled/disabled) of the thermal zone.

			- "enabled" means the kernel thermal management is
			  enabled.
			- "disabled" will prevent kernel thermal driver action
			  upon trip points so that user applications can take
			  charge of thermal management.
	.set_mode:
		set the mode (enabled/disabled) of the thermal zone.
	.get_trip_type:
		get the type of a certain trip point.
	.get_trip_temp:
			get the temperature above which a certain trip point
			will be fired.
	.set_emul_temp:
			set the emulation temperature which helps in debugging
			different threshold temperature points.
    tzp:
	thermal zone platform parameters.
    passive_delay:
	number of milliseconds to wait between polls when
	performing passive cooling.
    polling_delay:
	number of milliseconds to wait between polls when checking
	whether trip points have been crossed (0 for interrupt driven systems).
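The `mask` argument above is a plain bit string, so its meaning can be checked with ordinary bit arithmetic. The following is a minimal userspace sketch, not kernel code; `make_trip_mask()` and `trip_is_writable()` are hypothetical helper names introduced only for illustration, and the 31-bit limit is an assumption made to keep the value inside a non-negative `int`.

```c
#include <stdbool.h>

#define SKETCH_MAX_TRIP_BITS 31	/* assumption: keep the mask a non-negative int */

/* Build a mask in which every trip point listed in trips[] is writable. */
int make_trip_mask(const int *trips, int count)
{
	unsigned int mask = 0;

	for (int i = 0; i < count; i++)
		if (trips[i] >= 0 && trips[i] < SKETCH_MAX_TRIP_BITS)
			mask |= 1u << trips[i];
	return (int)mask;
}

/* Return true if bit 'trip' is set in mask, i.e. trip point 'trip' is writable. */
bool trip_is_writable(int mask, int trip)
{
	if (trip < 0 || trip >= SKETCH_MAX_TRIP_BITS)
		return false;
	return (((unsigned int)mask >> trip) & 1u) != 0;
}
```

For example, a zone whose trip points 0 and 2 should be writable would pass a mask of 0x5.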

    ::

	void thermal_zone_device_unregister(struct thermal_zone_device *tz)

    This interface function removes the thermal zone device.
    It deletes the corresponding entry from /sys/class/thermal folder and
    unbinds all the thermal cooling devices it uses.

	::

	   struct thermal_zone_device
	   *thermal_zone_of_sensor_register(struct device *dev, int sensor_id,
				void *data,
				const struct thermal_zone_of_device_ops *ops)

	This interface adds a new sensor to a DT thermal zone.
	This function will search the list of thermal zones described in
	the device tree and look for the zone that refers to the sensor device
	pointed to by dev->of_node as its temperature provider. For the zone
	pointing to the sensor node, the sensor will be added to the DT
	thermal zone device.

	The parameters for this interface are:

	dev:
			Device node of sensor containing valid node pointer in
			dev->of_node.
	sensor_id:
			a sensor identifier, in case the sensor IP has more
			than one sensor
	data:
			a private pointer (owned by the caller) that will be
			passed back, when a temperature reading is needed.
	ops:
			`struct thermal_zone_of_device_ops *`.

			==============  =======================================
			get_temp	a pointer to a function that reads the
					sensor temperature. This is a mandatory
					callback provided by the sensor driver.
			set_trips	a pointer to a function that sets a
					temperature window. When this window is
					left the driver must inform the thermal
					core via thermal_zone_device_update.
			get_trend 	a pointer to a function that reads the
					sensor temperature trend.
			set_emul_temp	a pointer to a function that sets
					sensor emulated temperature.
			==============  =======================================

	The thermal zone temperature is provided by the get_temp() function
	pointer of thermal_zone_of_device_ops. When called, it will
	have the private pointer @data back.

	It returns an error pointer on failure, otherwise a valid thermal zone
	device handle. The caller should check the returned handle with
	IS_ERR() to find out whether registration succeeded.
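The temperature window handed to a set_trips callback can be pictured as the pair of trip temperatures that bracket the current reading. The sketch below is a userspace model of that search, not the kernel implementation; `struct trip_window` and `find_trip_window()` are hypothetical names, and hysteresis is deliberately ignored, which is a simplifying assumption.

```c
#include <limits.h>

/* Hypothetical result type: the window around the current temperature. */
struct trip_window {
	int low;	/* highest trip temperature below the current one */
	int high;	/* lowest trip temperature above the current one */
};

/*
 * Find the trip temperatures immediately below and above 'temp'.
 * INT_MIN / INT_MAX mean "no trip point on that side".
 */
struct trip_window find_trip_window(const int *trip_temps, int count, int temp)
{
	struct trip_window w = { INT_MIN, INT_MAX };

	for (int i = 0; i < count; i++) {
		int t = trip_temps[i];

		if (t < temp && t > w.low)
			w.low = t;
		if (t > temp && t < w.high)
			w.high = t;
	}
	return w;
}
```

With trip points at 60000, 70000, 80000 and 100000 millidegree Celsius and a reading of 75000, the window is 70000 to 80000; a driver that programs hardware thresholds this way would notify the thermal core once the reading leaves that range.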

	::

	    void thermal_zone_of_sensor_unregister(struct device *dev,
						   struct thermal_zone_device *tzd)

	This interface unregisters a sensor from a DT thermal zone which was
	successfully added by interface thermal_zone_of_sensor_register().
	This function removes the sensor callbacks and private data from the
	thermal zone device registered with thermal_zone_of_sensor_register()
	interface. It will also silence the zone by removing the .get_temp() and
	.get_trend() thermal zone device callbacks.

	::

	  struct thermal_zone_device
	  *devm_thermal_zone_of_sensor_register(struct device *dev,
				int sensor_id,
				void *data,
				const struct thermal_zone_of_device_ops *ops)

	This interface is the resource-managed version of
	thermal_zone_of_sensor_register().

	All details of thermal_zone_of_sensor_register() described
	above apply here.

	The benefit of using this interface to register a sensor is that it
	is not required to explicitly call thermal_zone_of_sensor_unregister()
	in the error path or during driver unbinding, as this is done by the
	driver resource manager.

	::

		void devm_thermal_zone_of_sensor_unregister(struct device *dev,
						struct thermal_zone_device *tzd)

	This interface is the resource-managed version of
	thermal_zone_of_sensor_unregister().
	All details of thermal_zone_of_sensor_unregister() described
	above apply here.
	Normally this function will not need to be called and the resource
	management code will ensure that the resource is freed.

	::

		int thermal_zone_get_slope(struct thermal_zone_device *tz)

	This interface is used to read the slope attribute value
	for the thermal zone device, which might be useful for platform
	drivers for temperature calculations.

	::

		int thermal_zone_get_offset(struct thermal_zone_device *tz)

	This interface is used to read the offset attribute value
	for the thermal zone device, which might be useful for platform
	drivers for temperature calculations.

1.2 thermal cooling device interface
------------------------------------


    ::

	struct thermal_cooling_device
	*thermal_cooling_device_register(char *name,
			void *devdata, struct thermal_cooling_device_ops *ops)

    This interface function adds a new thermal cooling device (fan/processor/...)
    to /sys/class/thermal/ folder as `cooling_device[0-*]`. It tries to bind itself
    to all the thermal zone devices registered at the same time.

    name:
	the cooling device name.
    devdata:
	device private data.
    ops:
	thermal cooling device call-backs.

	.get_max_state:
		get the Maximum throttle state of the cooling device.
	.get_cur_state:
		get the Currently requested throttle state of the
		cooling device.
	.set_cur_state:
		set the Current throttle state of the cooling device.

    ::

	void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev)

    This interface function removes the thermal cooling device.
    It deletes the corresponding entry from /sys/class/thermal folder and
    unbinds itself from all the thermal zone devices using it.

1.3 interface for binding a thermal zone device with a thermal cooling device
-----------------------------------------------------------------------------

    ::

	int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
		int trip, struct thermal_cooling_device *cdev,
		unsigned long upper, unsigned long lower, unsigned int weight);

    This interface function binds a thermal cooling device to a particular trip
    point of a thermal zone device.

    This function is usually called in the thermal zone device .bind callback.

    tz:
	  the thermal zone device
    cdev:
	  thermal cooling device
    trip:
	  indicates which trip point in this thermal zone the cooling device
	  is associated with.
    upper:
	  the Maximum cooling state for this trip point.
	  THERMAL_NO_LIMIT means no upper limit,
	  and the cooling device can be in max_state.
    lower:
	  the Minimum cooling state that can be used for this trip point.
	  THERMAL_NO_LIMIT means no lower limit,
	  and the cooling device can be in cooling state 0.
    weight:
	  the influence of this cooling device in this thermal
	  zone.  See the .weight field of struct thermal_bind_params
	  in section 1.4 below for more information.

    ::

	int thermal_zone_unbind_cooling_device(struct thermal_zone_device *tz,
				int trip, struct thermal_cooling_device *cdev);

    This interface function unbinds a thermal cooling device from a particular
    trip point of a thermal zone device. This function is usually called in
    the thermal zone device .unbind callback.

    tz:
	the thermal zone device
    cdev:
	thermal cooling device
    trip:
	indicates which trip point in this thermal zone the cooling device
	is associated with.

1.4 Thermal Zone Parameters
---------------------------

    ::

	struct thermal_bind_params

    This structure defines the following parameters that are used to bind
    a zone with a cooling device for a particular trip point.

    .cdev:
	     The cooling device pointer
    .weight:
	     The 'influence' of a particular cooling device on this
	     zone. This is relative to the rest of the cooling
	     devices. For example, if all cooling devices have a
	     weight of 1, then they all contribute the same. You can
	     use percentages if you want, but it's not mandatory. A
	     weight of 0 means that this cooling device doesn't
	     contribute to the cooling of this zone unless all cooling
	     devices have a weight of 0. If all weights are 0, then
	     they all contribute the same.
    .trip_mask:
	       This is a bit mask that gives the binding relation between
	       this thermal zone and cdev, for a particular trip point.
	       If nth bit is set, then the cdev and thermal zone are bound
	       for trip point n.
    .binding_limits:
		     This is an array of cooling state limits. Must have
		     exactly 2 * thermal_zone.number_of_trip_points entries.
		     It is an array consisting of tuples
		     <lower-state upper-state> of state limits. Each trip
		     will be associated with one state limit tuple when
		     binding. A NULL pointer means
		     <THERMAL_NO_LIMIT THERMAL_NO_LIMIT> on all trips.
		     These limits are used when binding a cdev to a trip point.
    .match:
	    This call back returns success(0) if the 'tz and cdev' need to
	    be bound, as per platform data.
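The way .trip_mask and .binding_limits fit together can be modelled in a few lines. This is a userspace sketch under stated assumptions, not kernel code: `struct state_limits`, `limits_for_trip()` and `SKETCH_NO_LIMIT` are hypothetical names, and `SKETCH_NO_LIMIT` is only a stand-in whose value is not claimed to match the kernel's THERMAL_NO_LIMIT.

```c
#include <stddef.h>

#define SKETCH_NO_LIMIT (~0UL)	/* illustrative stand-in, value is an assumption */

/* One <lower-state upper-state> tuple. */
struct state_limits {
	unsigned long lower;
	unsigned long upper;
};

/*
 * Look up the state limits that apply when binding at trip point 'trip'.
 * Returns 1 and fills *out if bit 'trip' is set in trip_mask, else 0.
 * binding_limits holds 2 entries per trip: lower at [2 * trip],
 * upper at [2 * trip + 1]. A NULL array means no limit on any trip.
 */
int limits_for_trip(int trip_mask, const unsigned long *binding_limits,
		    int trip, struct state_limits *out)
{
	if (trip < 0 || trip >= 31)
		return 0;
	if ((((unsigned int)trip_mask >> trip) & 1u) == 0)
		return 0;

	if (binding_limits == NULL) {
		out->lower = SKETCH_NO_LIMIT;
		out->upper = SKETCH_NO_LIMIT;
	} else {
		out->lower = binding_limits[2 * trip];
		out->upper = binding_limits[2 * trip + 1];
	}
	return 1;
}
```

For a zone with three trip points, the array therefore has six entries, and a trip_mask of 0x3 binds the cooling device at trip points 0 and 1 only.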

    ::

	struct thermal_zone_params

    This structure defines the platform level parameters for a thermal zone.
    This data, for each thermal zone, should come from the platform layer.
    This is an optional feature where some platforms can choose not to
    provide this data.

    .governor_name:
	       Name of the thermal governor used for this zone
    .no_hwmon:
	       a boolean to indicate if the thermal to hwmon sysfs interface
	       is required. When no_hwmon == false, a hwmon sysfs interface
	       will be created. When no_hwmon == true, nothing will be done.
	       In case the thermal_zone_params is NULL, the hwmon interface
	       will be created (for backward compatibility).
    .num_tbps:
	       Number of thermal_bind_params entries for this zone
    .tbp:
	       thermal_bind_params entries

2. sysfs attributes structure
=============================

==	================
RO	read only value
WO	write only value
RW	read/write value
==	================

Thermal sysfs attributes will be represented under /sys/class/thermal.
Hwmon sysfs I/F extension is also available under /sys/class/hwmon
if hwmon is compiled in or built as a module.

Thermal zone device sys I/F, created once it's registered::

  /sys/class/thermal/thermal_zone[0-*]:
    |---type:			Type of the thermal zone
    |---temp:			Current temperature
    |---mode:			Working mode of the thermal zone
    |---policy:			Thermal governor used for this zone
    |---available_policies:	Available thermal governors for this zone
    |---trip_point_[0-*]_temp:	Trip point temperature
    |---trip_point_[0-*]_type:	Trip point type
    |---trip_point_[0-*]_hyst:	Hysteresis value for this trip point
    |---emul_temp:		Emulated temperature set node
    |---sustainable_power:      Sustainable dissipatable power
    |---k_po:                   Proportional term during temperature overshoot
    |---k_pu:                   Proportional term during temperature undershoot
    |---k_i:                    PID's integral term in the power allocator gov
    |---k_d:                    PID's derivative term in the power allocator
    |---integral_cutoff:        Offset above which errors are accumulated
    |---slope:                  Slope constant applied as linear extrapolation
    |---offset:                 Offset constant applied as linear extrapolation

Thermal cooling device sys I/F, created once it's registered::

  /sys/class/thermal/cooling_device[0-*]:
    |---type:			Type of the cooling device(processor/fan/...)
    |---max_state:		Maximum cooling state of the cooling device
    |---cur_state:		Current cooling state of the cooling device
    |---stats:			Directory containing cooling device's statistics
    |---stats/reset:		Writing any value resets the statistics
    |---stats/time_in_state_ms:	Time (msec) spent in various cooling states
    |---stats/total_trans:	Total number of times cooling state is changed
    |---stats/trans_table:	Cooling state transition table


The next three dynamic attributes are created/removed together. They represent
the relationship between a thermal zone and its associated cooling device.
They are created/removed for each successful execution of
thermal_zone_bind_cooling_device/thermal_zone_unbind_cooling_device.

::

  /sys/class/thermal/thermal_zone[0-*]:
    |---cdev[0-*]:		[0-*]th cooling device in current thermal zone
    |---cdev[0-*]_trip_point:	Trip point that cdev[0-*] is associated with
    |---cdev[0-*]_weight:       Influence of the cooling device in
				this thermal zone

Besides the thermal zone device sysfs I/F and cooling device sysfs I/F,
the generic thermal driver also creates a hwmon sysfs I/F for each _type_
of thermal zone device. E.g. the generic thermal driver registers one hwmon
class device and builds the associated hwmon sysfs I/F for all the registered
ACPI thermal zones.

::

  /sys/class/hwmon/hwmon[0-*]:
    |---name:			The type of the thermal zone devices
    |---temp[1-*]_input:	The current temperature of thermal zone [1-*]
    |---temp[1-*]_crit:		The critical trip point of thermal zone [1-*]

Please read Documentation/hwmon/sysfs-interface.rst for additional information.

Thermal zone attributes
-----------------------

type
	String which represents the thermal zone type.
	This is given by thermal zone driver as part of registration.
	E.g. "acpitz" indicates it's an ACPI thermal device.
	In order to keep it consistent with hwmon sys attribute, this should
	be a short, lowercase string, not containing spaces nor dashes.
	RO, Required

temp
	Current temperature as reported by thermal zone (sensor).
	Unit: millidegree Celsius
	RO, Required
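Because `temp` is reported in millidegree Celsius, a user space reader only needs to parse one integer and scale it. The following is a minimal sketch of such a reader; the helper names `thermal_zone_temp_path()`, `read_millicelsius()` and `millicelsius_to_celsius()` are hypothetical, and the code assumes the attribute contains a single decimal integer.

```c
#include <stdio.h>

/* Format the path of the temp attribute of thermal_zone<zone>. Returns 0 on success. */
int thermal_zone_temp_path(char *buf, size_t len, unsigned int zone)
{
	int n = snprintf(buf, len, "/sys/class/thermal/thermal_zone%u/temp", zone);

	return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Read one integer (millidegree Celsius) from 'path'. Returns 0 on success, -1 on error. */
int read_millicelsius(const char *path, long *out)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%ld", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Convert millidegree Celsius to degree Celsius. */
double millicelsius_to_celsius(long mc)
{
	return mc / 1000.0;
}
```

A reading of 37000 thus corresponds to 37 degrees Celsius.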

mode
	One of the predefined values in [enabled, disabled].
	This file gives information about the algorithm that is currently
	managing the thermal zone. It can be either the default kernel based
	algorithm or a user space application.

	enabled
			  enable Kernel Thermal management.
	disabled
			  prevent kernel thermal zone driver actions upon
			  trip points so that a user application can take full
			  charge of the thermal management.

	RW, Optional

policy
	One of the various thermal governors used for a particular zone.

	RW, Required

available_policies
	Available thermal governors which can be used for a particular zone.

	RO, Required

`trip_point_[0-*]_temp`
	The temperature above which the trip point will be fired.

	Unit: millidegree Celsius

	RO, Optional

`trip_point_[0-*]_type`
	String which indicates the type of the trip point.

	E.g. it can be one of critical, hot, passive, `active[0-*]` for ACPI
	thermal zone.

	RO, Optional

`trip_point_[0-*]_hyst`
	The hysteresis value for a trip point, represented as an integer.
	Unit: Celsius
	RW, Optional

`cdev[0-*]`
	Sysfs link to the thermal cooling device node that exposes the
	sys I/F for cooling device throttling control.

	RO, Optional

`cdev[0-*]_trip_point`
	The trip point in this thermal zone which `cdev[0-*]` is associated
	with; -1 means the cooling device is not associated with any trip
	point.

	RO, Optional

`cdev[0-*]_weight`
	The influence of `cdev[0-*]` in this thermal zone. This value
	is relative to the rest of cooling devices in the thermal
	zone. For example, if a cooling device has a weight double
	that of another, it's twice as effective in cooling the
	thermal zone.

	RW, Optional
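Since a weight only has meaning relative to the other weights in the zone, it can help to turn a set of weights into shares. The sketch below does only that arithmetic; `weight_share_percent()` is a hypothetical helper, and how a governor actually consumes the weights is not modelled here. The all-zero case follows the rule given for the .weight field in section 1.4: if all weights are 0, all devices contribute the same.

```c
/*
 * Return the share of cooling device 'idx' in percent (rounded down),
 * given the weights of all cooling devices in the zone.
 */
unsigned int weight_share_percent(const unsigned int *weights, int count, int idx)
{
	unsigned long long total = 0;

	if (count <= 0 || idx < 0 || idx >= count)
		return 0;

	for (int i = 0; i < count; i++)
		total += weights[i];

	/* All weights are 0: every cooling device contributes the same. */
	if (total == 0)
		return 100u / (unsigned int)count;

	return (unsigned int)((unsigned long long)weights[idx] * 100u / total);
}
```

With weights 2 and 1, the first device gets roughly two thirds of the influence and the second one third, matching the "twice as effective" reading above.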

emul_temp
	Interface to set the emulated temperature method in thermal zone
	(sensor). After setting this temperature, the thermal zone may pass
	this temperature to platform emulation function if registered or
	cache it locally. This is useful in debugging different temperature
	thresholds and their associated cooling actions. This is a write-only
	node, and writing 0 to this node should disable emulation.
	Unit: millidegree Celsius

	WO, Optional

	  WARNING:
	    Be careful while enabling this option on production systems,
	    because userland can easily disable the thermal policy by simply
	    flooding this sysfs node with low temperature values.

sustainable_power
	An estimate of the sustained power that can be dissipated by
	the thermal zone. Used by the power allocator governor. For
	more information see Documentation/driver-api/thermal/power_allocator.rst

	Unit: milliwatts

	RW, Optional

k_po
	The proportional term of the power allocator governor's PID
	controller during temperature overshoot. Temperature overshoot
	is when the current temperature is above the "desired
	temperature" trip point. For more information see
	Documentation/driver-api/thermal/power_allocator.rst

	RW, Optional

k_pu
	The proportional term of the power allocator governor's PID
	controller during temperature undershoot. Temperature undershoot
	is when the current temperature is below the "desired
	temperature" trip point. For more information see
	Documentation/driver-api/thermal/power_allocator.rst

	RW, Optional

k_i
	The integral term of the power allocator governor's PID
	controller. This term allows the PID controller to compensate
	for long term drift. For more information see
	Documentation/driver-api/thermal/power_allocator.rst

	RW, Optional

k_d
	The derivative term of the power allocator governor's PID
	controller. For more information see
	Documentation/driver-api/thermal/power_allocator.rst

	RW, Optional

integral_cutoff
	Temperature offset from the desired temperature trip point
	above which the integral term of the power allocator
	governor's PID controller starts accumulating errors. For
	example, if integral_cutoff is 0, then the integral term only
	accumulates error when temperature is above the desired
	temperature trip point. For more information see
	Documentation/driver-api/thermal/power_allocator.rst

	Unit: millidegree Celsius

	RW, Optional

slope
	The slope constant used in a linear extrapolation model
	to determine a hotspot temperature based off the sensor's
	raw readings. It is up to the device driver to determine
	the usage of these values.

	RW, Optional

offset
	The offset constant used in a linear extrapolation model
	to determine a hotspot temperature based off the sensor's
	raw readings. It is up to the device driver to determine
	the usage of these values.

	RW, Optional
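How `slope` and `offset` are applied is left to the device driver, so the following is only one possible reading of "linear extrapolation model": hotspot = slope * raw + offset, with no extra scaling. Both the formula's scaling and the helper name `estimate_hotspot()` are assumptions made for illustration; a real driver may scale the slope differently.

```c
/*
 * Hypothetical linear model: estimate a hotspot temperature from a raw
 * sensor reading. All values are assumed to be in the same unit
 * (e.g. millidegree Celsius) and slope is assumed to be unscaled.
 */
int estimate_hotspot(int raw, int slope, int offset)
{
	return slope * raw + offset;
}
```

Under these assumptions, a raw reading of 40000 with slope 1 and offset 2000 yields an estimated hotspot of 42000.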

Cooling device attributes
-------------------------

type
	String which represents the type of device, e.g:

	- for generic ACPI: should be "Fan", "Processor" or "LCD"
	- for memory controller device on intel_menlow platform:
	  should be "Memory controller".

	RO, Required

max_state
	The maximum permissible cooling state of this cooling device.

	RO, Required

cur_state
	The current cooling state of this cooling device.
	The value can be any integer between 0 and max_state:

	- cur_state == 0 means no cooling
	- cur_state == max_state means the maximum cooling.

	RW, Required
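A user space tool that writes `cur_state` may want to range-check the value against `max_state` first. The sketch below shows one way to do so; `cooling_state_valid()` and `write_cur_state()` are hypothetical helpers, and the caller is assumed to have read `max_state` beforehand and to pass the full path of the `cur_state` file.

```c
#include <stdio.h>

/* Return 0 if 'state' lies in [0, max_state], -1 otherwise. */
int cooling_state_valid(unsigned long state, unsigned long max_state)
{
	return state <= max_state ? 0 : -1;
}

/*
 * Write 'state' to the cur_state file at 'path' after range-checking it.
 * Returns 0 on success, -1 if the state is out of range or the write fails.
 */
int write_cur_state(const char *path, unsigned long state,
		    unsigned long max_state)
{
	FILE *f;
	int ok;

	if (cooling_state_valid(state, max_state) != 0)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = (fprintf(f, "%lu\n", state) > 0);
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

For a fan with max_state 2, writing 0 requests no cooling and writing 2 requests the maximum cooling; a request for state 3 is rejected before the file is touched.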

stats/reset
	Writing any value resets the cooling device's statistics.
	WO, Required

stats/time_in_state_ms
	The amount of time spent by the cooling device in various cooling
	states. The output will have "<state> <time>" pair in each line, which
	will mean this cooling device spent <time> msec of time at <state>.
	Output will have one line for each of the supported states.
	RO, Required

stats/total_trans
	A single positive value showing the total number of times the state of a
	cooling device is changed.

	RO, Required

stats/trans_table
	This gives fine grained information about all the cooling state
	transitions. The cat output here is a two dimensional matrix, where an
	entry <i,j> (row i, column j) represents the number of transitions from
	State_i to State_j. If the transition table is bigger than PAGE_SIZE,
	reading this will return an -EFBIG error.
	RO, Required

3. A simple implementation
==========================

ACPI thermal zone may support multiple trip points like critical, hot,
passive, active. If an ACPI thermal zone supports critical, passive,
active[0] and active[1] at the same time, it may register itself as a
thermal_zone_device (thermal_zone1) with 4 trip points in all.
It has one processor and one fan, which are both registered as
thermal_cooling_device. Both are considered to have the same
effectiveness in cooling the thermal zone.

If the processor is listed in _PSL method, and the fan is listed in _AL0
method, the sys I/F structure will be built like this::

 /sys/class/thermal:
  |thermal_zone1:
    |---type:			acpitz
    |---temp:			37000
    |---mode:			enabled
    |---policy:			step_wise
    |---available_policies:	step_wise fair_share
    |---trip_point_0_temp:	100000
    |---trip_point_0_type:	critical
    |---trip_point_1_temp:	80000
    |---trip_point_1_type:	passive
    |---trip_point_2_temp:	70000
    |---trip_point_2_type:	active0
    |---trip_point_3_temp:	60000
    |---trip_point_3_type:	active1
    |---cdev0:			--->/sys/class/thermal/cooling_device0
    |---cdev0_trip_point:	1	/* cdev0 can be used for passive */
    |---cdev0_weight:           1024
    |---cdev1:			--->/sys/class/thermal/cooling_device3
    |---cdev1_trip_point:	2	/* cdev1 can be used for active[0]*/
    |---cdev1_weight:           1024

  |cooling_device0:
    |---type:			Processor
    |---max_state:		8
    |---cur_state:		0

  |cooling_device3:
    |---type:			Fan
    |---max_state:		2
    |---cur_state:		0

 /sys/class/hwmon:
  |hwmon0:
    |---name:			acpitz
    |---temp1_input:		37000
    |---temp1_crit:		100000

4. Export Symbol APIs
=====================

4.1. get_tz_trend
-----------------

This function returns the trend of a thermal zone, i.e. the rate of change
of temperature of the thermal zone. Ideally, the thermal sensor drivers
are supposed to implement the callback. If they don't, the thermal
framework calculates the trend by comparing the previous and the current
temperature values.
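The fallback described above, comparing the previous and the current temperature, can be sketched as follows. This is a userspace model, not the kernel function: `enum sketch_trend` and `fallback_trend()` are hypothetical names chosen for illustration.

```c
/* Hypothetical trend values used only in this sketch. */
enum sketch_trend {
	SKETCH_TREND_STABLE,
	SKETCH_TREND_RAISING,
	SKETCH_TREND_DROPPING,
};

/* Derive a trend from the previous and the current temperature reading. */
enum sketch_trend fallback_trend(int last_temp, int temp)
{
	if (temp > last_temp)
		return SKETCH_TREND_RAISING;
	if (temp < last_temp)
		return SKETCH_TREND_DROPPING;
	return SKETCH_TREND_STABLE;
}
```

A sensor driver that knows more about its hardware, for example an averaged rate of change, can report a better trend than this two-sample comparison, which is why implementing the callback is preferred.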

4.2. get_thermal_instance
-------------------------

This function returns the thermal_instance corresponding to a given
{thermal_zone, cooling_device, trip_point} combination. Returns NULL
if such an instance does not exist.

4.3. thermal_cdev_update
------------------------

This function serves as an arbitrator to set the state of a cooling
device. It sets the cooling device to the deepest cooling state if
possible.
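One way to picture this arbitration is as a maximum over the states requested for the same cooling device. The sketch below assumes that each binding of the cooling device carries one requested state, which is an assumption of this example; `arbitrate_cooling_state()` is a hypothetical helper, not the kernel function.

```c
/*
 * Pick the deepest (highest) requested cooling state, clamped to
 * max_state. With no requests the result is state 0 (no cooling).
 */
unsigned long arbitrate_cooling_state(const unsigned long *requests, int count,
				      unsigned long max_state)
{
	unsigned long target = 0;

	for (int i = 0; i < count; i++)
		if (requests[i] > target)
			target = requests[i];

	return target > max_state ? max_state : target;
}
```

If one trip point asks for state 1 and another for state 3, the device ends up in state 3, so the most demanding request wins.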

5. thermal_emergency_poweroff
=============================

On an event of critical trip temperature crossing, the thermal framework
shuts down the system by calling hw_protection_shutdown(). The
hw_protection_shutdown() function first attempts to perform an orderly
shutdown, but accepts a delay after which it proceeds to a forced power-off
or, as a last resort, an emergency_restart.

The delay should be carefully profiled so as to give adequate time for
orderly poweroff.

If the delay is set to 0, emergency poweroff will not be supported. So a
carefully profiled non-zero positive value is a must for emergency
poweroff to be triggered.
