.. SPDX-License-Identifier: GPL-2.0

============================================================
Hardware-Feedback Interface for scheduling on Intel Hardware
============================================================

Overview
--------

Intel has described the Hardware Feedback Interface (HFI) in the Intel 64 and
IA-32 Architectures Software Developer's Manual (Intel SDM) Volume 3 Section
14.6 [1]_.

The HFI gives the operating system performance and energy efficiency
capability data for each CPU in the system. Linux can use the information from
the HFI to influence task placement decisions.

The Hardware Feedback Interface
-------------------------------

The Hardware Feedback Interface provides the operating system with information
about the performance and energy efficiency of each CPU in the system. Each
capability is given as a unit-less quantity in the range [0-255]. Higher values
indicate higher capability. Energy efficiency and performance are reported in
separate capabilities. Even though on some systems these two metrics may be
related, they are specified as independent capabilities in the Intel SDM.

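As a rough illustration, the two capabilities of one CPU fit in a pair of
bytes. The sketch below is not the layout of the HFI table and is not taken
from the driver. The names ``hfi_caps_example`` and ``hfi_most_performant()``
are hypothetical. It only shows that the two values are independent and that
a higher value means a higher capability.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical per-CPU view of the two capabilities (not the table layout). */
   struct hfi_caps_example {
       uint8_t perf_cap;   /* performance, 0..255, higher is better */
       uint8_t ee_cap;     /* energy efficiency, 0..255, higher is better */
   };

   /*
    * Return the index of the CPU with the highest performance capability,
    * or -1 if the array is empty. Ties go to the lowest index.
    */
   static int hfi_most_performant(const struct hfi_caps_example *caps, size_t n)
   {
       int best = -1;
       size_t i;

       for (i = 0; i < n; i++) {
           if (best < 0 || caps[i].perf_cap > caps[best].perf_cap)
               best = (int)i;
       }
       return best;
   }

A caller interested in energy efficiency would compare ``ee_cap`` instead.
Neither value can be derived from the other.
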
These capabilities may change at runtime as a result of changes in the
operating conditions of the system or the action of external factors. The rate
at which these capabilities are updated is specific to each processor model. On
some models, capabilities are set at boot time and never change. On others,
capabilities may change every few tens of milliseconds. For instance, a remote
mechanism may be used to lower Thermal Design Power. Such a change can be
reflected in the HFI. Likewise, if the system needs to be throttled due to
excessive heat, the HFI may reflect reduced performance on specific CPUs.

The kernel or a userspace policy daemon can use these capabilities to modify
task placement decisions. For instance, if either the performance or the energy
efficiency capability of a given logical processor becomes zero, the hardware
is recommending that the operating system not schedule any tasks on that
processor for performance or energy efficiency reasons, respectively.

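The zero-capability rule can be written as a small predicate. This is a
minimal sketch of how a policy might read the recommendation, not code from
the kernel or from any existing daemon. The enum ``hfi_goal_example`` and the
function ``hfi_should_avoid()`` are hypothetical names.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Hypothetical placement goals a policy might optimize for. */
   enum hfi_goal_example {
       HFI_GOAL_PERFORMANCE,
       HFI_GOAL_ENERGY_EFFICIENCY,
   };

   /*
    * Return true if the hardware recommends against scheduling on a CPU
    * for the given goal, that is, if the matching capability is zero.
    */
   static bool hfi_should_avoid(uint8_t perf_cap, uint8_t ee_cap,
                                enum hfi_goal_example goal)
   {
       if (goal == HFI_GOAL_PERFORMANCE)
           return perf_cap == 0;
       return ee_cap == 0;
   }

A CPU whose performance capability is zero may still be a reasonable choice
for a policy that only cares about energy efficiency, and the other way
around.
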
Implementation details for Linux
--------------------------------

The infrastructure to handle thermal event interrupts has two parts. The
Local Vector Table of a CPU's local APIC contains the Thermal Monitor
Register. This register controls how interrupts are delivered to a CPU when
the thermal monitor generates an interrupt. Further details can be found in
the Intel SDM Vol. 3 Section 10.5 [1]_.

The thermal monitor may generate interrupts per CPU or per package. The HFI
generates package-level interrupts. This monitor is configured and initialized
via a set of machine-specific registers. Specifically, the HFI interrupt and
status are controlled via designated bits in the IA32_PACKAGE_THERM_INTERRUPT
and IA32_PACKAGE_THERM_STATUS registers, respectively. There is one HFI
table per package. Further details can be found in the Intel SDM Vol. 3
Section 14.9 [1]_.

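The bit handling on these two registers can be sketched with plain integer
operations. The register addresses and bit positions below are assumptions
that should be checked against the Intel SDM [1]_ before use. The ``EX_``
macros and ``ex_`` helpers are hypothetical names, and the helpers work on
register values that the caller has already read. They do not access any MSR.

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Assumed values; verify them against the Intel SDM before relying on them. */
   #define EX_MSR_IA32_PACKAGE_THERM_STATUS     0x1b1
   #define EX_MSR_IA32_PACKAGE_THERM_INTERRUPT  0x1b2
   #define EX_PACKAGE_THERM_STATUS_HFI_UPDATED  (1ULL << 26)
   #define EX_PACKAGE_THERM_INT_HFI_ENABLE      (1ULL << 25)

   /* Return the interrupt register value with the HFI interrupt enabled. */
   static uint64_t ex_hfi_enable_interrupt(uint64_t therm_int)
   {
       return therm_int | EX_PACKAGE_THERM_INT_HFI_ENABLE;
   }

   /* Return true if the status register value reports an updated HFI table. */
   static bool ex_hfi_table_updated(uint64_t therm_status)
   {
       return (therm_status & EX_PACKAGE_THERM_STATUS_HFI_UPDATED) != 0;
   }

Because the registers are per package, a value read on one CPU applies to
every CPU in the same package.
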
The hardware issues an HFI interrupt after it has updated the HFI table and
the table is ready for the operating system to consume. CPUs receive such
interrupts via the thermal entry in the Local APIC's Local Vector Table.

When servicing such an interrupt, the HFI driver parses the updated table and
relays the update to userspace using the thermal notification framework. Given
that there may be many HFI updates every second, the updates relayed to
userspace are throttled to at most one every CONFIG_HZ jiffies, that is, one
per second.

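The rate limit itself can be illustrated with a tick counter. The sketch
below is a userspace model under stated assumptions, not the driver's code.
``EX_HZ`` stands in for CONFIG_HZ with an arbitrary example value, and
``ex_hfi_throttle`` and ``ex_hfi_should_relay()`` are hypothetical names. It
drops updates that arrive too early, whereas a driver may prefer to defer
them so that the latest table is still delivered.

.. code-block:: c

   #include <stdbool.h>

   #define EX_HZ 250UL   /* stand-in for CONFIG_HZ; the value is only an example */

   /* Hypothetical throttle state: the tick at which the last update was relayed. */
   struct ex_hfi_throttle {
       unsigned long last_relay;
       bool relayed_once;
   };

   /*
    * Return true if an update arriving at tick 'now' should be relayed,
    * allowing at most one relay every EX_HZ ticks (one second).
    * The unsigned subtraction stays correct when the counter wraps.
    */
   static bool ex_hfi_should_relay(struct ex_hfi_throttle *t, unsigned long now)
   {
       if (t->relayed_once && now - t->last_relay < EX_HZ)
           return false;

       t->last_relay = now;
       t->relayed_once = true;
       return true;
   }

With a zero-initialized state, the first update is always relayed, and later
updates pass only once EX_HZ ticks have elapsed since the last relay.
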
References
----------

.. [1] https://www.intel.com/sdm