# phosphor-pid-control

## Objective

Develop a tray-level fan control system that will use exhaust temperature and
other machine temperature information to control fan speeds in order to keep
machines within acceptable operating conditions.

Effectively, this ports the Chromium EC thermal code to run on the BMC and use
the OpenBMC dbus namespace and IPMI commands.

## Background

Recent server systems come with a general secondary processing system attached
for the purpose of monitoring and control, generally referred to as a BMC[^2].
There is a large effort to develop an open source framework for writing
applications and control systems that will run on the BMC, known as
OpenBMC[^3]<sup>,</sup>[^4]. Within Google the effort has been internalized
(while also providing upstream pushes) as gBMC[^5]. The primary goal of OpenBMC
is to provide support for remote and local system management through a REST
interface, and also through IPMI[^6], tied together via the system dbus. OpenBMC
provides many applications and daemons that we can leverage and improve.

The BMC is wired such that it has direct access to and control over many
motherboard components, including fans and temperature sensors[^7]. Therefore,
it is an ideal location to run a thermal control loop, similar to the EC.
However, to be upstreamed, the code will need to follow (as best as possible)
the OpenBMC specifications for communicating and executing[^8].

IPMI allows for OEM commands to provide custom information flow or system
control with a BMC. OEM commands are already lined up for certain other accesses
routed through the BMC, and can be upstreamed for others to use.

## Overview

The BMC will run a daemon that controls the fans by pre-defined zones.
The application will use thermal control, such that each defined zone is kept
within a range and adjusted based on thermal information provided from locally
readable sensors as well as host-provided information over an IPMI OEM
command.

A system (or tray) will be broken out into one or more zones, specified via
configuration files or dbus. Each zone will contain at least one fan, at
least one temperature sensor, and some device margins. The sensor data can
be provided via sysfs, dbus, or IPMI. In any case, default margins should be
provided in case of failure or another unknown situation.

The system will run a control loop for each zone that attempts to keep the
temperature in that zone within the margin for the devices specified.

## Detailed Design

The software will run as a multi-threaded daemon that runs a control loop for
each zone, and has a master thread which listens for dbus messages. Each zone
will require at least one fan that it exclusively controls; however, zones can
share temperature sensors.

![Swampd Architecture](swampd_diagram.png "Swampd Architecture")

This figure lays out the communication channels between swampd, ipmid, and
phosphor-hwmon.

### OpenBMC Upstream

To be upstreamed to OpenBMC for use on open-power systems, we need to follow the
OpenBMC code style specification[^9] and leverage the dbus framework for reading
sensors and fan control[^10].

There is already a daemon which, given a configuration file for a hwmon device,
will add it to the dbus objects namespace. That daemon handles queries for
values such as temperature or fan speed and allows another process to control
the fan speed[^11]. The goal is to use this daemon through dbus to read the
onboard sensors and control the fans.

The present implementation of the dbus interfaces requires controlling the fans
by specifying an RPM target, whereas the driver we're using for Quanta-Q71l
(the first system) only allows writing PWM. PWM can be written either directly
or via dbus.

### Zone Specification

A configuration file will need to exist for each board, likely in YAML[^12].
Similar information will also be necessary for gsys, such that it knows what
sensors to read and send to the BMC. Presently it does this for the EC, so
doing the same for the BMC should not be unreasonable.

Each zone must have at least one fan that it exclusively controls. Each zone
must have at least one temperature sensor, but sensors may be shared between
zones.

The external devices specified in the zone must have default information as a
fallback, while their current temperatures will be provided by gsys. Some
devices adapt quickly and others slowly; this distinction will need to be
factored in and described in the configuration.

The internal thermometers specified will be read via sysfs.
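
As a rough illustration of reading an internal thermometer via sysfs, the sketch
below parses a hwmon temperature attribute. The helper name
`readSysfsTempCelsius` is hypothetical and not part of swampd. It assumes the
attribute follows the usual hwmon convention of reporting millidegrees Celsius
as a plain integer.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not from swampd): reads a hwmon temperature attribute.
// Assumes the file holds a single integer in millidegrees Celsius, which is
// the usual convention for hwmon temp*_input attributes.
// Returns false if the file cannot be opened or does not start with a number.
bool readSysfsTempCelsius(const std::string& path, double& celsius)
{
    std::ifstream file(path);
    long long milliDegrees = 0;
    if (!(file >> milliDegrees))
    {
        return false;
    }
    celsius = static_cast<double>(milliDegrees) / 1000.0;
    return true;
}
```

A caller would pass the attribute path for the board in question (for example a
`temp1_input` file under a hwmon directory, which is an assumed name here) and
fall back to the configured default when the function returns false.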

#### A proposed configuration file:

```
{ZONEID}:
  {PIDID}:
    type: "fan" | "margin"
    ipmi:
      {IPMI_ID}
      name: "fan1"
      readPath: "/xyz/openbmc_project/sensors/fan_tach/fan1"
      writePath: "/sys/class/hwmon/hwmon0/pwm0"
    pidinfo:
      samplerate: 0.1 // sample time in seconds
      p_coeff: 0.01 // coefficient for proportional
      i_coeff: 0.001 // coefficient for integral
      integral_limit:
        min: 0
        max: 100
      output_limit:
        min: 0
        max: 100
      slew_negative: 0
      slew_positive: 0
  {PIDID}:
    type: "margin"
    ipmi:
      {IPMI_ID}
      name: "sluggish0"
      readPath: "/xyz/openbmc_project/sensors/external/sluggish0"
      writePath: ""
    pidinfo:
      samplerate: 1 // sample time in seconds
      p_coeff: 94.0
      i_coeff: 2.0
      integral_limit:
        min: 3000
        max: 10000
      output_limit:
        min: 3000
        max: 10000
      slew_negative: 0
      slew_positive: 0
```

### Chassis Delta

Due to data center requirements, the delta between the outgoing air temperature
and the environmental air temperature must be no greater than 15°C.

### IPMI Command Specification

Gsys needs the ability to send the BMC margin information for the devices that
gsys knows how to read but the BMC cannot. No IPMI command currently supports
this use case, so it will be added as an OEM command.

The state of the BMC-readable temperature sensors can be read through normal
IPMI commands and is already supported.

#### OEM Set Control

Gsys needs to be able to set the control of the thermal system to either
automatic or manual. When in manual mode, the daemon will effectively wait to
be told to return to automatic mode. It is expected that, while in manual mode,
something else will be controlling the fans via the other commands.

Manual mode is controlled by zone through the following OEM command:

##### Request

Byte | Purpose      | Value
---- | ------------ | -----------------------------------------------------
`00` | `netfn`      | `0x2e`
`01` | `command`    | `0x04 (also using manual command)`
`02` | `oem1`       | `0xcf`
`03` | `oem2`       | `0xc2`
`04` | `padding`    | `0x00`
`05` | `SubCommand` | `Get or Set. Get == 0, Set == 1`
`06` | `ZoneId`     |
`07` | `Mode`       | `If Set, Value 1 == Manual Mode, 0 == Automatic Mode`

##### Response

Byte | Purpose   | Value
---- | --------- | -----------------------------------------------------
`02` | `oem1`    | `0xcf`
`03` | `oem2`    | `0xc2`
`04` | `padding` | `0x00`
`07` | `Mode`    | `If Set, Value 1 == Manual Mode, 0 == Automatic Mode`

#### OEM Get Failsafe Mode

Gbmctool needs to be able to read back whether a zone is in failsafe mode. This
setting is read-only because it's dynamically determined within swampd per zone.

##### Request

Byte | Purpose      | Value
---- | ------------ | ----------------------------------
`00` | `netfn`      | `0x2e`
`01` | `command`    | `0x04 (also using manual command)`
`02` | `oem1`       | `0xcf`
`03` | `oem2`       | `0xc2`
`04` | `padding`    | `0x00`
`05` | `SubCommand` | `Get == 2`
`06` | `ZoneId`     |

##### Response

Byte | Purpose    | Value
---- | ---------- | -----------------------------------------------
`02` | `oem1`     | `0xcf`
`03` | `oem2`     | `0xc2`
`04` | `padding`  | `0x00`
`07` | `failsafe` | `1 == in Failsafe Mode, 0 not in failsafe mode`

#### Set Sensor Value

Gsys needs to update the thermal controller with information not necessarily
available to the BMC. This will comprise a list of temperature (or margin?)
sensors that are updated by the set sensor command.
Because they don't represent real sensors in the system, the set sensor handler
can simply broadcast the update as a properties update on dbus when it receives
the command over IPMI.

#### Set Fan PWM

Gsys can override a specific fan's PWM when we implement the set sensor IPMI
command pathway.

#### Get Fan Tach

Gsys can read fan_tach through the normal IPMI interface presently exported for
sensors.

### Sensor Update Loop

The plan is to listen for fan_tach updates for each fan in a background thread.
This thread will receive an update from phosphor-hwmon each time it updates any
sensor the thread cares about.

By default phosphor-hwmon reads each sensor in turn and then sleeps for 1
second. We'll be updating phosphor-hwmon to sleep for a shorter period; how
short is still TBD. We'll also be updating phosphor-hwmon to support pwm as a
target.

### Thermal Control Loops

Each zone will require a control loop that monitors the associated thermals and
controls the fan(s). The EC PID loop is designed to hit the fans 10 times per
second to drive them to the desired value and to read the sensors once per
second. We'll be receiving sensor updates with similar regularity; however, at
present it takes ~0.13s to read all 8 fans, so they can't be read constantly
without bringing the system to its knees, in that all CPU cycles would be spent
reading the fans. How frequently we'll read the fan sensors, and the impact
this will have, is TBD.

### Main Thread

The main thread will manage the other threads and process the initial
configuration files. It will also register a dbus handler for the OEM message.

### Enabling Logging

By default, swampd isn't compiled to log information. To compile it for tuning,
you'll need to add the following to the recipe:

```
EXTRA_OEMAKE_append_YOUR_MACHINE = " CXXFLAGS='${CXXFLAGS} -D__TUNING_LOGGING__'"
```
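
For readers unfamiliar with this pattern, the sketch below shows one way a
define such as `-D__TUNING_LOGGING__` can gate logging at compile time. Only
the macro name comes from the text above. The `kTuningLoggingEnabled` constant
and the `tuningLog` helper are hypothetical and do not describe how swampd
itself logs.

```cpp
#include <iostream>
#include <string>

// Only the macro name is taken from the recipe line; everything else here is
// an illustrative assumption.
#ifdef __TUNING_LOGGING__
constexpr bool kTuningLoggingEnabled = true;
#else
constexpr bool kTuningLoggingEnabled = false;
#endif

// Writes the message to stderr only when the build defined the macro.
// Returns whether the message was emitted.
inline bool tuningLog(const std::string& message)
{
    if (kTuningLoggingEnabled)
    {
        std::cerr << "[tuning] " << message << '\n';
    }
    return kTuningLoggingEnabled;
}
```

Because the flag is a compile-time constant, a build without the define pays no
runtime cost beyond a branch the compiler can remove.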

## Project Information

This project is designed to be a daemon running within the OpenBMC environment.
It will use a well-defined configuration file to control the temperature of the
tray components to keep them within operating conditions. It will require
coordination with gsys and OpenBMC. Providing a host-side service upstream to
talk to the BMC is beyond the scope of this project.

## Security Considerations

A rogue client on the host could send invalid thermal information, causing
physical damage to the system. There will be an effort to sanity check all input
from gsys to alleviate this concern.

## Privacy Considerations

This device holds no user data; however, you could profile the types of jobs
executed on the server by watching its temperatures.

## Testing Plan

Individual code logic will be tested through unit tests. Some pieces of code
rely on abstractions so that dbus can be swapped out for something
encapsulated, which allows testing without strictly running on a real system.

Testing the system on real hardware will be performed to verify:

1. The fallback values are used when gsys isn't reporting.
1. The system behaves as expected given the information it reads.

Unit tests will verify that the daemon validates information from gsys properly
and handles difficult-to-reproduce edge cases.

The testing of this project on real hardware can likely fold into the general
gBMC testing planned.

## Code Notes

Swampd's primary function is to drive the fans of a system given various inputs.

### Layout

The code is broken out into modules as follows:

* `dbus` - Any read or write interface that uses dbus primarily.
* `experiments` - Small execution paths that allow for fan examination
including how quickly fans respond to changes.
* `ipmi` - Manual control for any zone is handled by receiving an IPMI message.
This holds the ipmid provider for receiving those messages and sending them
on to swampd.
* `notimpl` - These are read-only and write-only interface implementations that
can be dropped into a pluggable sensor to make it complete.
* `pid` - This contains all the PID associated code, including the zone
definition, controller definition, and the PID computational code.
* `scripts` - This contains the scripts that convert YAML into C++.
* `sensors` - This contains a couple of sensor types including the pluggable
sensor's definition. It also holds the sensor manager.
* `sysfs` - This contains code that reads from or writes to sysfs.
* `threads` - Most of swampd's threads run in this method where there's just a
dbus bus that we manage.

## Example System Configurations

### Two Margin Sensors Into Three Fans (Non-Step PID)

```
A single zone system where multiple margin thermal sensors are fed into one PID
that generates the output RPM for a set of fans controlled by one PID.

margin sensors as input to thermal pid

fleeting0+---->+-------+    +-------+     Thermal PID sampled
               |  min()+--->+  PID  |     slower rate.
fleeting1+---->+-------+    +---+---+
                                |
                                |
                                | RPM set-point
              Current RPM       v
                             +--+-----+
  The Fan PID    fan0+--->   |          New PWM  +-->fan0
  samples at a               |        |          |
  faster rate    fan1+--->      PID   +---------->--->fan1
  speeding up the            |        |          |
  fans.          fan2+--->   |                   +-->fan2
                     ^       +--------+              +
                     |                               |
                     +-------------------------------+
                            RPM updated by PWM.
```

## Notes

[^2]: BMC - Baseboard Management Controller
[^3]: https://github.com/openbmc/openbmc
[^4]: https://github.com/facebook/openbmc
[^5]: http://go/gbmc
[^6]: http://www.intel.com/content/www/us/en/servers/ipmi/ipmi-second-gen-interface-spec-v2-rev1-1.html
[^7]: Excluding temperature sensors on PCIe cards and other add-ons.
[^8]: They prefer C++.
[^9]: https://github.com/openbmc/docs/blob/master/cpp-style-and-conventions.md
[^10]: https://github.com/openbmc/phosphor-dbus-interfaces
[^11]: https://github.com/openbmc/phosphor-hwmon
[^12]: YAML appears to be the configuration language of choice for OpenBMC.