.. SPDX-License-Identifier: GPL-2.0

============================================================
Intel(R) Speed Select Technology User Guide
============================================================

The Intel(R) Speed Select Technology (Intel(R) SST) provides a powerful new
collection of features that give more granular control over CPU performance.
With Intel(R) SST, one server can be configured for power and performance for a
variety of diverse workload requirements.

Refer to the links below for an overview of the technology:

- https://www.intel.com/content/www/us/en/architecture-and-technology/speed-select-technology-article.html
- https://builders.intel.com/docs/networkbuilders/intel-speed-select-technology-base-frequency-enhancing-performance.pdf

These capabilities are further enhanced in some of the newer generations of
server platforms where these features can be enumerated and controlled
dynamically without pre-configuring via BIOS setup options. This dynamic
configuration is done via mailbox commands to the hardware. One way to enumerate
and configure these features is by using the Intel Speed Select utility.

This document explains how to use the Intel Speed Select tool to enumerate and
control Intel(R) SST features.
This document gives example commands and explains
how these commands change the power and performance profile of the system under
test. Using this tool as an example, customers can replicate the messaging
implemented in the tool in their production software.

intel-speed-select configuration tool
======================================

Most Linux distributions may include the "intel-speed-select" tool as a package.
If not, it can be built by downloading the Linux kernel tree from kernel.org. Once
downloaded, the tool can be built without building the full kernel.

From the kernel tree, run the following commands::

    # cd tools/power/x86/intel-speed-select/
    # make
    # make install

Getting Help
------------

To get help with the tool, execute the command below::

    # intel-speed-select --help

The top-level help describes arguments and features. Notice that there is a
multi-level help structure in the tool. For example, to get help for the feature
"perf-profile"::

    # intel-speed-select perf-profile --help

To get help on a command, another level of help is provided.
For example, for the command "info"::

    # intel-speed-select perf-profile info --help

Summary of platform capability
------------------------------

To check the current platform and driver capabilities, execute::

    # intel-speed-select --info

For example on a test system::

    # intel-speed-select --info
    Intel(R) Speed Select Technology
    Executing on CPU model: X
    Platform: API version : 1
    Platform: Driver version : 1
    Platform: mbox supported : 1
    Platform: mmio supported : 1
    Intel(R) SST-PP (feature perf-profile) is supported
    TDP level change control is unlocked, max level: 4
    Intel(R) SST-TF (feature turbo-freq) is supported
    Intel(R) SST-BF (feature base-freq) is not supported
    Intel(R) SST-CP (feature core-power) is supported

Intel(R) Speed Select Technology - Performance Profile (Intel(R) SST-PP)
------------------------------------------------------------------------

This feature allows configuration of a server dynamically based on workload
performance requirements.
This helps users during deployment as they do not have
to choose a specific server configuration statically. This Intel(R) Speed Select
Technology - Performance Profile (Intel(R) SST-PP) feature introduces a mechanism
that allows multiple optimized performance profiles per system. Each profile
defines a set of CPUs that need to be online, with the rest offline, to sustain a
guaranteed base frequency. Once the user issues a command to use a specific
performance profile and meets the CPU online/offline requirement, the user can
expect a change in the base frequency dynamically. This feature is called
"perf-profile" when using the Intel Speed Select tool.

Number of performance levels
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There can be multiple performance profiles on a system.
To get the number of
profiles, execute the command below::

    # intel-speed-select perf-profile get-config-levels
    Intel(R) Speed Select Technology
    Executing on CPU model: X
    package-0
      die-0
        cpu-0
          get-config-levels:4
    package-1
      die-0
        cpu-14
          get-config-levels:4

On this system under test, there are 4 performance profiles in addition to the
base performance profile (which is performance level 0).

Lock/Unlock status
~~~~~~~~~~~~~~~~~~

Even if there are multiple performance profiles, it is possible that they
are locked. If they are locked, users cannot issue a command to change the
performance state. There may be a BIOS setup option to unlock them; otherwise,
check with your system vendor.

To check if the system is locked, execute the following command::

    # intel-speed-select perf-profile get-lock-status
    Intel(R) Speed Select Technology
    Executing on CPU model: X
    package-0
      die-0
        cpu-0
          get-lock-status:0
    package-1
      die-0
        cpu-14
          get-lock-status:0

In this case, lock status is 0, which means that the system is unlocked.

Properties of a performance level
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To get properties of a specific performance level (for example, level 0 below),
execute the command below::

    # intel-speed-select perf-profile info -l 0
    Intel(R) Speed Select Technology
    Executing on CPU model: X
    package-0
      die-0
        cpu-0
          perf-profile-level-0
            cpu-count:28
            enable-cpu-mask:000003ff,f0003fff
            enable-cpu-list:0,1,2,3,4,5,6,7,8,9,10,11,12,13,28,29,30,31,32,33,34,35,36,37,38,39,40,41
            thermal-design-power-ratio:26
            base-frequency(MHz):2600
            speed-select-turbo-freq:disabled
            speed-select-base-freq:disabled
            ...
            ...

Here, the -l option is used to specify a performance level.

If the -l option is omitted, this command prints information about all
the performance levels. The above command prints the properties of
performance level 0.

For this performance profile, the CPUs listed in
"enable-cpu-mask/enable-cpu-list" are the maximum set that can be "online." When
that condition is met, the base frequency of 2600 MHz can be maintained. To
understand more, execute "intel-speed-select perf-profile info" for performance
level 4::

    # intel-speed-select perf-profile info -l 4
    Intel(R) Speed Select Technology
    Executing on CPU model: X
    package-0
      die-0
        cpu-0
          perf-profile-level-4
            cpu-count:28
            enable-cpu-mask:000000fa,f0000faf
            enable-cpu-list:0,1,2,3,5,7,8,9,10,11,28,29,30,31,33,35,36,37,38,39
            thermal-design-power-ratio:28
            base-frequency(MHz):2800
            speed-select-turbo-freq:disabled
            speed-select-base-freq:unsupported
            ...
            ...
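
The "enable-cpu-mask" and "enable-cpu-list" fields in the two outputs above
describe the same set of CPUs. As a minimal sketch (not part of the tool), the
following Python snippet decodes such a mask, assuming the comma-separated
groups are 32-bit hexadecimal words with the most significant word first. The
function name ``cpus_from_mask`` is hypothetical:

```python
def cpus_from_mask(mask: str) -> list:
    """Decode a comma-separated hex CPU mask into a sorted list of CPU numbers.

    Assumption: the groups are 32-bit words, most significant first, so
    removing the commas yields a single hexadecimal number in which bit N
    stands for CPU N.
    """
    value = int(mask.replace(",", ""), 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


# Masks copied from the level 0 and level 4 outputs above.
level0_cpus = cpus_from_mask("000003ff,f0003fff")
level4_cpus = cpus_from_mask("000000fa,f0000faf")

print(level0_cpus)
print(level4_cpus)
```

Decoded this way, both masks agree with the corresponding "enable-cpu-list"
lines in the outputs above.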

There are fewer CPUs in the "enable-cpu-mask/enable-cpu-list". Consequently, if
the user only keeps these CPUs online and the rest "offline," then the base
frequency is increased to 2.8 GHz compared to 2.6 GHz at performance level 0.

Get current performance level
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To get the current performance level, execute::

    # intel-speed-select perf-profile get-config-current-level
    Intel(R) Speed Select Technology
    Executing on CPU model: X
    package-0
      die-0
        cpu-0
          get-config-current_level:0

First verify that the base_frequency displayed by the cpufreq sysfs is correct::

    # cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
    2600000

This matches the base-frequency (MHz) field value displayed from the
"perf-profile info" command for performance level 0 (the cpufreq frequency is in
kHz).
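
The unit conversion in this check is easy to get wrong in scripts. The
following Python sketch shows the comparison under the assumption that the
sysfs value has already been read into a string. The helper names
``khz_to_mhz`` and ``base_frequency_matches`` are hypothetical and not part of
any tool:

```python
def khz_to_mhz(khz: int) -> int:
    """Convert a cpufreq value in kHz to whole MHz."""
    return khz // 1000


def base_frequency_matches(sysfs_value: str, profile_mhz: int) -> bool:
    """Compare a base_frequency string from cpufreq sysfs (kHz) with the
    base-frequency(MHz) value reported by "perf-profile info"."""
    return khz_to_mhz(int(sysfs_value.strip())) == profile_mhz


# Values taken from the examples above: sysfs reports 2600000 kHz and the
# level 0 profile reports 2600 MHz.
print(base_frequency_matches("2600000\n", 2600))
```

On a real system the string would come from reading
/sys/devices/system/cpu/cpu0/cpufreq/base_frequency.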

To check if the average frequency is equal to the base frequency for a 100% busy
workload, disable turbo::

    # echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

Then run a busy workload on all CPUs, for example::

    # stress -c 64

To verify the base frequency, run turbostat::

    # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1

    Package Core    CPU     Bzy_MHz
            -       -       2600
    0       0       0       2600
    0       1       1       2600
    0       2       2       2600
    0       3       3       2600
    0       4       4       2600
    .       .       .       .


Changing performance level
~~~~~~~~~~~~~~~~~~~~~~~~~~

To change the performance level to 4, execute::

    # intel-speed-select -d perf-profile set-config-level -l 4 -o
    Intel(R) Speed Select Technology
    Executing on CPU model: X
    package-0
      die-0
        cpu-0
          perf-profile
            set_tdp_level:success

In the command above, "-o" is optional.
If it is specified, then it will also
offline CPUs which are not present in the enable_cpu_mask for this performance
level.

Now if the base_frequency is checked::

    # cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
    2800000

This shows that the base frequency has increased from 2600 MHz at performance
level 0 to 2800 MHz at performance level 4. As a result, any workload that can
use fewer CPUs can see a boost of 200 MHz compared to performance level 0.

Changing performance level via BMC Interface
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It is possible to change the SST-PP level using an out-of-band (OOB) agent (via
some remote management console, through the BMC "Baseboard Management
Controller" interface). This mode is supported from the Sapphire Rapids processor
generation. The kernel and tool changes to support this mode were added in Linux
kernel version 5.18. To enable this feature, the kernel config
"CONFIG_INTEL_HFI_THERMAL" is required. The minimum version of the tool that
supports this feature is "v1.12", which is part of Linux kernel version 5.18.

To support such configuration, this tool can be used as a daemon.
Add the command line option --oob::

    # intel-speed-select --oob
    Intel(R) Speed Select Technology
    Executing on CPU model:143[0x8f]
    OOB mode is enabled and will run as daemon

In this mode the tool will online/offline CPUs based on the new performance
level.

Check presence of other Intel(R) SST features
---------------------------------------------

Each of the performance profiles also specifies whether there is support for the
other two Intel(R) SST features (Intel(R) Speed Select Technology - Base Frequency
(Intel(R) SST-BF) and Intel(R) Speed Select Technology - Turbo Frequency
(Intel(R) SST-TF)).

For example, from the output of "perf-profile info" above, for level 0 and level
4:

For level 0::

    speed-select-turbo-freq:disabled
    speed-select-base-freq:disabled

For level 4::

    speed-select-turbo-freq:disabled
    speed-select-base-freq:unsupported

Given these results, the "speed-select-base-freq" (Intel(R) SST-BF) in level 4
changed from "disabled" to "unsupported" compared to performance level 0.
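
Each status string carries two separate facts: whether the feature is supported
at that level, and whether it is currently active. A minimal Python sketch of
that reading follows. The function name ``feature_state`` is hypothetical, and
the "enabled" value is an assumption, since only "disabled" and "unsupported"
appear in the outputs above:

```python
def feature_state(status: str) -> tuple:
    """Map a status string to (supported, active)."""
    status = status.strip().lower()
    if status == "unsupported":
        return (False, False)
    if status == "disabled":
        # Supported at this level, but not activated by the user.
        return (True, False)
    if status == "enabled":
        # Assumption: this value is not shown in the outputs above.
        return (True, True)
    raise ValueError("unexpected status: " + status)


# Status values copied from the level 0 and level 4 outputs.
levels = {
    0: {"speed-select-turbo-freq": "disabled",
        "speed-select-base-freq": "disabled"},
    4: {"speed-select-turbo-freq": "disabled",
        "speed-select-base-freq": "unsupported"},
}

for level, features in levels.items():
    for name, status in features.items():
        supported, active = feature_state(status)
        print(level, name, "supported" if supported else "not supported",
              "active" if active else "inactive")
```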

This means that at performance level 4, the "speed-select-base-freq" feature is
not supported. However, at performance level 0, this feature is "supported", but
currently "disabled", meaning the user has not activated this feature. In
contrast, "speed-select-turbo-freq" (Intel(R) SST-TF) is supported at both
performance levels, but currently not activated by the user.

The Intel(R) SST-BF and the Intel(R) SST-TF features are built on a foundation
technology called Intel(R) Speed Select Technology - Core Power (Intel(R) SST-CP).
The platform firmware enables this feature when Intel(R) SST-BF or Intel(R) SST-TF
is supported on a platform.

Intel(R) Speed Select Technology - Core Power (Intel(R) SST-CP)
---------------------------------------------------------------

Intel(R) Speed Select Technology - Core Power (Intel(R) SST-CP) is an interface
that allows users to define per-core priority. This defines a mechanism to
distribute power among cores in a power-constrained scenario. This defines a
class of service (CLOS) configuration.

The user can configure up to 4 class of service configurations. Each CLOS group
configuration allows definitions of parameters, which affect how the frequency
can be limited and power is distributed.
Each CPU core can be tied to a class of
service and hence an associated priority. The granularity is at the core level,
not at the per-CPU level.

Enable CLOS based prioritization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To use the CLOS based prioritization feature, the firmware must be informed to
enable and use a priority type. There is a default per-platform priority type,
which can be changed with an optional command line parameter.

To enable and check the options, execute::

    # intel-speed-select core-power enable --help
    Intel(R) Speed Select Technology
    Executing on CPU model: X
    Enable core-power for a package/die
        Clos Enable: Specify priority type with [--priority|-p]
            0: Proportional, 1: Ordered

There are two priority types:

- Ordered

Priority for ordered throttling is defined based on the index of the assigned
CLOS group, where CLOS0 gets the highest priority (throttled last).

Priority order is:
CLOS0 > CLOS1 > CLOS2 > CLOS3.

- Proportional

When proportional priority is used, there is an additional parameter called
frequency_weight, which can be specified per CLOS group. The goal of
proportional priority is to provide each core with the requested minimum, then
distribute all remaining (excess/deficit) budgets in proportion to a defined
weight. This proportional priority can be configured using the "core-power
config" command.

To enable with the platform default priority type, execute::

    # intel-speed-select core-power enable
    Intel(R) Speed Select Technology
    Executing on CPU model: X
    package-0
      die-0
        cpu-0
          core-power
            enable:success
    package-1
      die-0
        cpu-6
          core-power
            enable:success

This enable applies per package, or per die when a package contains
multiple dies. To check if CLOS is enabled and get the priority type, the
"core-power info" command can be used.
For example, to check the status of the core-power feature
on CPU 0, execute::

    # intel-speed-select -c 0 core-power info
    Intel(R) Speed Select Technology
    Executing on CPU model: X
    package-0
      die-0
        cpu-0
          core-power
            support-status:supported
            enable-status:enabled
            clos-enable-status:enabled
            priority-type:proportional
    package-1
      die-0
        cpu-24
          core-power
            support-status:supported
            enable-status:enabled
            clos-enable-status:enabled
            priority-type:proportional

Configuring CLOS groups
~~~~~~~~~~~~~~~~~~~~~~~

Each CLOS group has its own attributes including min, max, freq_weight and
desired. These parameters can be configured with the "core-power config" command.
Defaults are used for any parameter the user does not set, except the clos id,
which is mandatory.
To check core-power config options, execute::

    # intel-speed-select core-power config --help
    Intel(R) Speed Select Technology
    Executing on CPU model: X
    Set core-power configuration for one of the four clos ids
        Specify targeted clos id with [--clos|-c]
        Specify clos Proportional Priority [--weight|-w]
        Specify clos min in MHz with [--min|-n]
        Specify clos max in MHz with [--max|-m]

For example::

    # intel-speed-select core-power config -c 0
    Intel(R) Speed Select Technology
    Executing on CPU model: X
    clos epp is not specified, default: 0
    clos frequency weight is not specified, default: 0
    clos min is not specified, default: 0 MHz
    clos max is not specified, default: 25500 MHz
    clos desired is not specified, default: 0
    package-0
      die-0
        cpu-0
          core-power
            config:success
    package-1
      die-0
        cpu-6
          core-power
            config:success

The user has the option to change defaults.
For example, the user can set
"min" to the base frequency so that the guaranteed base frequency is always
delivered.

Get the current CLOS configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To check the current configuration, "core-power get-config" can be used. For
example, to get the configuration of CLOS 0::

    # intel-speed-select core-power get-config -c 0
    Intel(R) Speed Select Technology
    Executing on CPU model: X
    package-0
      die-0
        cpu-0
          core-power
            clos:0
            epp:0
            clos-proportional-priority:0
            clos-min:0 MHz
            clos-max:Max Turbo frequency
            clos-desired:0 MHz
    package-1
      die-0
        cpu-24
          core-power
            clos:0
            epp:0
            clos-proportional-priority:0
            clos-min:0 MHz
            clos-max:Max Turbo frequency
            clos-desired:0 MHz

Associating a CPU with a CLOS group
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To associate a CPU with a CLOS group, the "core-power assoc" command can be used::

    # intel-speed-select core-power assoc --help
    Intel(R) Speed Select Technology
    Executing on CPU model: X
    Associate a clos id to a CPU
        Specify targeted clos id with [--clos|-c]


For example, to associate CPU 10 with CLOS group 3, execute::

    # intel-speed-select -c 10 core-power assoc -c 3
    Intel(R) Speed Select Technology
    Executing on CPU model: X
    package-0
      die-0
        cpu-10
          core-power
            assoc:success

Once a CPU is associated, its sibling CPUs are also associated with the same
CLOS group. Once associated, avoid changing Linux "cpufreq" subsystem scaling
frequency limits.

To check the existing association for a CPU, the "core-power get-assoc" command
can be used.
For example, to get the association of CPU 10, execute::

 # intel-speed-select -c 10 core-power get-assoc
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-1
  die-0
    cpu-10
      get-assoc
        clos:3

This shows that CPU 10 is part of CLOS group 3.


Disable CLOS based prioritization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To disable, execute::

 # intel-speed-select core-power disable

Some features, such as Intel(R) SST-TF, can only be enabled when CLOS based
prioritization is enabled. For this reason, disabling CLOS based prioritization
while Intel(R) SST-TF is enabled can cause Intel(R) SST-TF to fail, so the
"disable" command displays an error if Intel(R) SST-TF is already enabled. To
disable CLOS based prioritization, the Intel(R) SST-TF feature must be disabled
first.

Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF)
-------------------------------------------------------------------

The Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF) feature lets
the user control base frequency.
If some critical workload threads demand
constant high guaranteed performance, then this feature can be used to execute
those threads at a higher base frequency on specific sets of CPUs (high priority
CPUs) at the cost of a lower base frequency on the other CPUs (low priority
CPUs). This feature does not require taking the low priority CPUs offline.

The support of Intel(R) SST-BF depends on the Intel(R) Speed Select Technology -
Performance Profile (Intel(R) SST-PP) performance level configuration. It is
possible that only certain performance levels support Intel(R) SST-BF. It is also
possible that only the base performance level (level = 0) supports Intel(R)
SST-BF. Consequently, first select the desired performance level to enable this
feature.

In the system under test here, Intel(R) SST-BF is supported at the base
performance level 0, but currently disabled. For example, for level 0::

 # intel-speed-select -c 0 perf-profile info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      perf-profile-level-0
        ...

        speed-select-base-freq:disabled
        ...

Before enabling Intel(R) SST-BF and measuring its impact on workload
performance, execute a workload and measure its performance to get a baseline
to compare against.

Here the user wants more guaranteed performance. For this reason, it is likely
that turbo is disabled. To disable turbo, execute::

 # echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

Based on the output of "intel-speed-select perf-profile info -l 0", the
guaranteed base frequency is 2600 MHz.


Measure baseline performance for comparison
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To compare, pick a multi-threaded workload where each thread can be scheduled on
separate CPUs. The "Hackbench pipe" test is a good example of how to improve
performance using Intel(R) SST-BF.

Below, the workload is measuring average scheduler wakeup latency, so a lower
number means better performance::

 # taskset -c 3,4 perf bench -r 100 sched pipe
 # Running 'sched/pipe' benchmark:
 # Executed 1000000 pipe operations between two processes
     Total time: 6.102 [sec]
     6.102445 usecs/op
     163868 ops/sec

While the above test is running, the turbostat output shows that two of the
CPUs are busy and reaching the maximum frequency (which is the base frequency,
as turbo is disabled). The turbostat output::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
 Package  Core  CPU  Bzy_MHz
 0        0     0    1000
 0        1     1    1005
 0        2     2    1000
 0        3     3    2600
 0        4     4    2600
 0        5     5    1000
 0        6     6    1000
 0        7     7    1005
 0        8     8    1005
 0        9     9    1000
 0        10    10   1000
 0        11    11   995
 0        12    12   1000
 0        13    13   1000

From the above turbostat output, both CPU 3 and 4 are very busy and reaching
the full guaranteed frequency of 2600 MHz.
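
The same check can be scripted. The following is a minimal sketch, assuming
the turbostat samples have been captured as text with the column layout shown
above; the helper name ``cpus_at_or_above`` is hypothetical and is not part of
turbostat or intel-speed-select.

```python
# Bzy_MHz samples copied from the baseline turbostat output (turbo disabled).
TURBOSTAT_SAMPLE = """\
Package Core CPU Bzy_MHz
0 0 0 1000
0 1 1 1005
0 2 2 1000
0 3 3 2600
0 4 4 2600
0 5 5 1000
0 6 6 1000
0 7 7 1005
0 8 8 1005
0 9 9 1000
0 10 10 1000
0 11 11 995
0 12 12 1000
0 13 13 1000
"""

BASE_FREQ_MHZ = 2600  # guaranteed base frequency of performance level 0


def cpus_at_or_above(table: str, mhz: int) -> list[int]:
    """Return the CPU numbers whose Bzy_MHz column is at least `mhz`."""
    lines = table.strip().splitlines()
    header = lines[0].split()
    cpu_col = header.index("CPU")
    mhz_col = header.index("Bzy_MHz")
    rows = (line.split() for line in lines[1:])
    return [int(row[cpu_col]) for row in rows if int(row[mhz_col]) >= mhz]


busy_cpus = cpus_at_or_above(TURBOSTAT_SAMPLE, BASE_FREQ_MHZ)
print("CPUs running at the base frequency:", busy_cpus)
```

With the sample above, only the two CPUs the benchmark was pinned to are
reported.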

Intel(R) SST-BF Capabilities
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To get the capabilities of Intel(R) SST-BF for the current performance level 0,
execute::

 # intel-speed-select base-freq info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      speed-select-base-freq
        high-priority-base-frequency(MHz):3000
        high-priority-cpu-mask:00000216,00002160
        high-priority-cpu-list:5,6,8,13,33,34,36,41
        low-priority-base-frequency(MHz):2400
        tjunction-temperature(C):125
        thermal-design-power(W):205

The above capabilities show that there are some CPUs on this system that can
offer a base frequency of 3000 MHz, compared to the standard base frequency at
this performance level. These CPUs are fixed, however, and they are presented
via high-priority-cpu-list/high-priority-cpu-mask. If the Intel(R) SST-BF
feature is selected, the low priority CPUs (which are not in
high-priority-cpu-list) can only offer up to 2400 MHz.
As a result, if this
clipping of low priority CPUs is acceptable, then the user can enable the
Intel(R) SST-BF feature. It suits the above "sched pipe" workload in particular:
since only two CPUs are used, its threads can be scheduled on high priority
CPUs and get a boost of 400 MHz.

Enable Intel(R) SST-BF
~~~~~~~~~~~~~~~~~~~~~~

To enable the Intel(R) SST-BF feature, execute::

 # intel-speed-select base-freq enable -a
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      base-freq
        enable:success
 package-1
  die-0
    cpu-14
      base-freq
        enable:success

In this case, the -a option is optional. It not only enables Intel(R) SST-BF, but
also adjusts the priority of cores using Intel(R) Speed Select Technology Core
Power (Intel(R) SST-CP) features. This option sets the minimum performance of each
Intel(R) Speed Select Technology - Performance Profile (Intel(R) SST-PP) class to
maximum performance so that the hardware will give the maximum performance
possible for each CPU.
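
For scripts that pin workloads to high priority CPUs, the
high-priority-cpu-mask reported by "base-freq info" can be converted into CPU
numbers. The following is a minimal sketch, assuming the mask is a
comma-separated list of 32-bit hexadecimal words with the most significant
word first and every word zero-padded to eight digits, as in the output shown
earlier; ``decode_cpu_mask`` is a hypothetical helper, not part of the tool.

```python
def decode_cpu_mask(mask: str) -> list[int]:
    """Convert a comma-separated hexadecimal CPU mask into CPU numbers."""
    # Assumption: each word is exactly eight hex digits and the most
    # significant word is first, so joining the words gives one bitmap.
    bitmap = int(mask.replace(",", ""), 16)
    return [cpu for cpu in range(bitmap.bit_length()) if (bitmap >> cpu) & 1]


# Mask copied from the "base-freq info -l 0" output in this guide.
high_priority_cpus = decode_cpu_mask("00000216,00002160")
print(high_priority_cpus)
```

For the mask above, the decoded CPU numbers match the reported
high-priority-cpu-list.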

If the -a option is not used, then the following steps are required before
enabling Intel(R) SST-BF:

- Discover Intel(R) SST-BF and note low and high priority base frequency
- Note the high priority CPU list
- Enable CLOS using core-power feature set
- Configure CLOS parameters. Use CLOS.min to set to minimum performance
- Subscribe desired CPUs to CLOS groups

With this configuration, execute the same workload pinned to high priority CPUs
(CPU 5 and 6 in this case)::

 # taskset -c 5,6 perf bench -r 100 sched pipe
 # Running 'sched/pipe' benchmark:
 # Executed 1000000 pipe operations between two processes
     Total time: 5.627 [sec]
     5.627922 usecs/op
     177685 ops/sec

This way, by enabling Intel(R) SST-BF, the performance of this benchmark is
improved (latency reduced) by about 7.8%. From the turbostat output, it can be
observed that the high priority CPUs reached 3000 MHz compared to 2600 MHz.
The turbostat output::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
 Package  Core  CPU  Bzy_MHz
 0        0     0    2151
 0        1     1    2166
 0        2     2    2175
 0        3     3    2175
 0        4     4    2175
 0        5     5    3000
 0        6     6    3000
 0        7     7    2180
 0        8     8    2662
 0        9     9    2176
 0        10    10   2175
 0        11    11   2176
 0        12    12   2176
 0        13    13   2661

Disable Intel(R) SST-BF
~~~~~~~~~~~~~~~~~~~~~~~

To disable the Intel(R) SST-BF feature, execute::

 # intel-speed-select base-freq disable -a


Intel(R) Speed Select Technology - Turbo Frequency (Intel(R) SST-TF)
--------------------------------------------------------------------

This feature makes it possible to set different "All core turbo ratio limits"
for cores based on their priority. By using this feature, some cores can be
configured to get a higher turbo frequency by designating them as high priority,
at the cost of lower or no turbo frequency on the low priority cores.

For this reason, this feature is only useful when the system is busy utilizing
all CPUs, but the user wants some configurable option to get high performance on
some CPUs.

The support of Intel(R) Speed Select Technology - Turbo Frequency (Intel(R) SST-TF)
depends on the Intel(R) Speed Select Technology - Performance Profile (Intel(R)
SST-PP) performance level configuration. It is possible that only a certain
performance level supports Intel(R) SST-TF. It is also possible that only the base
performance level (level = 0) supports Intel(R) SST-TF. Hence, first select the
desired performance level to enable this feature.

In the system under test here, Intel(R) SST-TF is supported at the base
performance level 0, but currently disabled::

 # intel-speed-select -c 0 perf-profile info -l 0
 Intel(R) Speed Select Technology
 package-0
  die-0
    cpu-0
      perf-profile-level-0
        ...
        ...
        speed-select-turbo-freq:disabled
        ...
        ...


To check if performance can be improved using the Intel(R) SST-TF feature, get
the turbo frequency properties with Intel(R) SST-TF enabled and compare them to
the base turbo capability of this system.

Get Base turbo capability
~~~~~~~~~~~~~~~~~~~~~~~~~

To get the base turbo capability of performance level 0, execute::

 # intel-speed-select perf-profile info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      perf-profile-level-0
        ...
        ...
        turbo-ratio-limits-sse
          bucket-0
            core-count:2
            max-turbo-frequency(MHz):3200
          bucket-1
            core-count:4
            max-turbo-frequency(MHz):3100
          bucket-2
            core-count:6
            max-turbo-frequency(MHz):3100
          bucket-3
            core-count:8
            max-turbo-frequency(MHz):3100
          bucket-4
            core-count:10
            max-turbo-frequency(MHz):3100
          bucket-5
            core-count:12
            max-turbo-frequency(MHz):3100
          bucket-6
            core-count:14
            max-turbo-frequency(MHz):3100
          bucket-7
            core-count:16
            max-turbo-frequency(MHz):3100

Based on the data above, when all the CPUs are busy, the maximum frequency of
3100 MHz can be achieved. If there is some busy workload on CPUs 0 - 11 (e.g.
stress) and the "hackbench pipe" workload is executed on CPU 12 and 13, the
result is::

 # taskset -c 12,13 perf bench -r 100 sched pipe
 # Running 'sched/pipe' benchmark:
 # Executed 1000000 pipe operations between two processes
     Total time: 5.705 [sec]
     5.705488 usecs/op
     175269 ops/sec

The turbostat output::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
 Package  Core  CPU  Bzy_MHz
 0        0     0    3000
 0        1     1    3000
 0        2     2    3000
 0        3     3    3000
 0        4     4    3000
 0        5     5    3100
 0        6     6    3100
 0        7     7    3000
 0        8     8    3100
 0        9     9    3000
 0        10    10   3000
 0        11    11   3000
 0        12    12   3100
 0        13    13   3100

Based on the turbostat output, the performance is limited by the frequency cap
of 3100 MHz. To check if the hackbench performance can be improved for CPU 12
and CPU 13, first check the capability of the Intel(R) SST-TF feature for this
performance level.
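
The turbo-ratio-limits-sse buckets reported by "perf-profile info" in this
section can be read as a lookup table. The following is a minimal sketch,
assuming each bucket gives the maximum turbo frequency available while no more
than "core-count" cores are active; that reading of the buckets, and the helper
name ``max_turbo_mhz``, are assumptions for illustration and are not taken from
the tool.

```python
# (core-count, max-turbo-frequency in MHz) pairs copied from the
# "turbo-ratio-limits-sse" section of the perf-profile info output.
SSE_BUCKETS = [
    (2, 3200), (4, 3100), (6, 3100), (8, 3100),
    (10, 3100), (12, 3100), (14, 3100), (16, 3100),
]


def max_turbo_mhz(active_cores: int, buckets=SSE_BUCKETS) -> int:
    """Return the turbo limit of the smallest bucket covering active_cores."""
    for core_count, mhz in sorted(buckets):
        if active_cores <= core_count:
            return mhz
    raise ValueError("more active cores than the largest bucket covers")


# 14 busy CPUs (0 - 13), as in the stress plus hackbench scenario.
print(max_turbo_mhz(14), "MHz with 14 active cores")
print(max_turbo_mhz(2), "MHz with 2 active cores")
```

Under this reading, the 3200 MHz limit is only reachable while at most two
cores are active, which is why the busy system is capped at 3100 MHz.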

Get Intel(R) SST-TF Capability
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To get the capability, the "turbo-freq info" command can be used::

 # intel-speed-select turbo-freq info -l 0
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-0
      speed-select-turbo-freq
        bucket-0
          high-priority-cores-count:2
          high-priority-max-frequency(MHz):3200
          high-priority-max-avx2-frequency(MHz):3200
          high-priority-max-avx512-frequency(MHz):3100
        bucket-1
          high-priority-cores-count:4
          high-priority-max-frequency(MHz):3100
          high-priority-max-avx2-frequency(MHz):3000
          high-priority-max-avx512-frequency(MHz):2900
        bucket-2
          high-priority-cores-count:6
          high-priority-max-frequency(MHz):3100
          high-priority-max-avx2-frequency(MHz):3000
          high-priority-max-avx512-frequency(MHz):2900
        speed-select-turbo-freq-clip-frequencies
          low-priority-max-frequency(MHz):2600
          low-priority-max-avx2-frequency(MHz):2400
          low-priority-max-avx512-frequency(MHz):2100


Based on the output above, there is an Intel(R) SST-TF bucket with two high
priority cores. If only two high priority cores are set, then the maximum turbo
frequency on those cores can be increased to 3200 MHz. This is 100 MHz more
than the base turbo capability for all cores.

In turn, for the hackbench workload, two CPUs can be set as high priority and
the rest as low priority. One side effect is that once enabled, the low priority
cores will be clipped to a lower frequency of 2600 MHz.

Enable Intel(R) SST-TF
~~~~~~~~~~~~~~~~~~~~~~

To enable Intel(R) SST-TF, execute::

 # intel-speed-select -c 12,13 turbo-freq enable -a
 Intel(R) Speed Select Technology
 Executing on CPU model: X
 package-0
  die-0
    cpu-12
      turbo-freq
        enable:success
 package-0
  die-0
    cpu-13
      turbo-freq
        enable:success
 package--1
  die-0
    cpu-63
      turbo-freq --auto
        enable:success

In this case, the option "-a" is optional.
If set, it enables the Intel(R) SST-TF
feature and also sets the CPUs to high and low priority using Intel(R) Speed
Select Technology Core Power (Intel(R) SST-CP) features. The CPU numbers passed
with "-c" arguments are marked as high priority, including their siblings.

If the -a option is not used, then the following steps are required before
enabling Intel(R) SST-TF:

- Discover Intel(R) SST-TF and note buckets of high priority cores and maximum frequency
- Enable CLOS using core-power feature set
- Configure CLOS parameters
- Subscribe desired CPUs to CLOS groups making sure that high priority cores are set to the maximum frequency

Execute the same hackbench workload again, scheduling the hackbench threads on
high priority CPUs::

 # taskset -c 12,13 perf bench -r 100 sched pipe
 # Running 'sched/pipe' benchmark:
 # Executed 1000000 pipe operations between two processes
     Total time: 5.510 [sec]
     5.510165 usecs/op
     180826 ops/sec

This improves performance by around 3.3% on a busy system. Here the turbostat
output shows that CPU 12 and CPU 13 are getting a 100 MHz boost.
The turbostat output::

 # turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
 Package  Core  CPU  Bzy_MHz
 ...
 0        12    12   3200
 0        13    13   3200
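
The improvement figures quoted in this guide can be cross-checked from the
"usecs/op" values of the "perf bench sched pipe" runs. The following is a
minimal sketch; it compares latency, so the results may differ slightly from
figures derived from "ops/sec" or from rounded totals, and
``latency_reduction_percent`` is a hypothetical helper name.

```python
def latency_reduction_percent(baseline_usecs: float, tuned_usecs: float) -> float:
    """Return how much lower the tuned latency is than the baseline, in percent."""
    return (baseline_usecs - tuned_usecs) / baseline_usecs * 100.0


# usecs/op values copied from the benchmark runs in this guide.
sst_bf = latency_reduction_percent(6.102445, 5.627922)  # CPUs 3,4 vs CPUs 5,6
sst_tf = latency_reduction_percent(5.705488, 5.510165)  # CPUs 12,13 before/after

print(f"Intel(R) SST-BF latency reduction: {sst_bf:.2f}%")
print(f"Intel(R) SST-TF latency reduction: {sst_tf:.2f}%")
```

Both results are in line with the approximate improvements stated in the
Intel(R) SST-BF and Intel(R) SST-TF sections.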