.. SPDX-License-Identifier: GPL-2.0

============================================================
Hardware-Feedback Interface for scheduling on Intel Hardware
============================================================

Overview
--------

Intel has described the Hardware Feedback Interface (HFI) in the Intel 64 and
IA-32 Architectures Software Developer's Manual (Intel SDM) Volume 3 Section
14.6 [1]_.

The HFI gives the operating system performance and energy efficiency
capability data for each CPU in the system. Linux can use the information from
the HFI to influence task placement decisions.

The Hardware Feedback Interface
-------------------------------

The Hardware Feedback Interface provides the operating system with information
about the performance and energy efficiency of each CPU in the system. Each
capability is given as a unit-less quantity in the range [0-255]. Higher values
indicate higher capability. Energy efficiency and performance are reported in
separate capabilities. Even though on some systems these two metrics may be
related, they are specified as independent capabilities in the Intel SDM.
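
As a minimal sketch of how two independent, unit-less capabilities in the
range [0-255] might be compared, the following C fragment ranks CPUs by each
capability separately. The ``struct hfi_caps`` layout, its field names, and
the helper functions are hypothetical and chosen for illustration only; they
are not the layout of the hardware table nor the kernel driver's data
structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU record; names are illustrative, not from the SDM. */
struct hfi_caps {
	uint8_t perf;	/* performance capability, 0-255, higher is better */
	uint8_t ee;	/* energy efficiency capability, 0-255, higher is better */
};

/*
 * Return the index of the entry with the highest performance capability,
 * or -1 if the array is empty. Ties go to the lowest index.
 */
static int hfi_best_perf(const struct hfi_caps *caps, size_t n)
{
	int best = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (best < 0 || caps[i].perf > caps[best].perf)
			best = (int)i;
	}
	return best;
}

/* Same as above, but ranked by the energy efficiency capability. */
static int hfi_best_ee(const struct hfi_caps *caps, size_t n)
{
	int best = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (best < 0 || caps[i].ee > caps[best].ee)
			best = (int)i;
	}
	return best;
}
```

Because the two capabilities are independent, the CPU that ranks highest for
performance need not be the one that ranks highest for energy efficiency, so
the two helpers can return different indices for the same input.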

These capabilities may change at runtime as a result of changes in the
operating conditions of the system or the action of external factors. The rate
at which these capabilities are updated is specific to each processor model. On
some models, capabilities are set at boot time and never change. On others,
capabilities may change every tens of milliseconds. For instance, a remote
mechanism may be used to lower Thermal Design Power. Such a change can be
reflected in the HFI. Likewise, if the system needs to be throttled due to
excessive heat, the HFI may reflect reduced performance on specific CPUs.

The kernel or a userspace policy daemon can use these capabilities to modify
task placement decisions. For instance, if either the performance or the
energy efficiency capability of a given logical processor becomes zero, the
hardware is recommending that the operating system not schedule any tasks on
that processor, for performance or energy efficiency reasons, respectively.

Implementation details for Linux
--------------------------------

The infrastructure to handle thermal event interrupts has two parts. In the
Local Vector Table of a CPU's local APIC, there exists a Thermal Monitor
Register. This register controls how interrupts are delivered to a CPU when
the thermal monitor generates an interrupt.
Further details can be found in the Intel SDM Vol. 3 Section 10.5 [1]_.

The thermal monitor may generate interrupts per CPU or per package. The HFI
generates package-level interrupts. This monitor is configured and initialized
via a set of machine-specific registers. Specifically, the HFI interrupt and
status are controlled via designated bits in the IA32_PACKAGE_THERM_INTERRUPT
and IA32_PACKAGE_THERM_STATUS registers, respectively. There exists one HFI
table per package. Further details can be found in the Intel SDM Vol. 3
Section 14.9 [1]_.

The hardware issues an HFI interrupt after it has updated the HFI table and
the table is ready for the operating system to consume. CPUs receive this
interrupt via the thermal entry in the Local APIC's Local Vector Table.

When servicing this interrupt, the HFI driver parses the updated table and
relays the update to userspace using the thermal notification framework. Given
that there may be many HFI updates every second, the updates relayed to
userspace are throttled to at most one every CONFIG_HZ jiffies, that is, one
per second.

References
----------

.. [1] https://www.intel.com/sdm