# How to Configure Phosphor-pid-control

A system needs two groups of configurations: zones and sensors.

They can come either from a dedicated config file or via D-Bus from
e.g. `entity-manager`.

## D-Bus Configuration

If the config file does not exist, the configuration is obtained from a set of
D-Bus interfaces. When using `entity-manager` to provide them, refer to the
`Pid`, `Pid.Zone` and `Stepwise`
[schemas](https://github.com/openbmc/entity-manager/tree/master/schemas). The
key names are not identical to JSON but similar enough to see the
correspondence.

## Compile Flag Configuration

### --strict-failsafe-pwm

This build flag sets the fans strictly at the failsafe percent when
in failsafe mode, even when the calculated PWM is higher than the failsafe PWM.
Without this enabled, the PWM is calculated and set to the calculated PWM
**or** the failsafe PWM, whichever is higher.

## JSON Configuration

The default config file path `/usr/share/swampd/config.json` can be overridden
by using the `--conf` command line option.

The JSON object should be a dictionary with two keys, `sensors` and `zones`.
`sensors` is a list of the sensor dictionaries, whereas `zones` is a list of
zones.

### Sensors

```
"sensors" : [
    {
        "name": "fan1",
        "type": "fan",
        "readPath": "/xyz/openbmc_project/sensors/fan_tach/fan1",
        "writePath": "/sys/devices/platform/ahb/ahb:apb/1e786000.pwm-tacho-controller/hwmon/**/pwm1",
        "min": 0,
        "max": 255,
        "ignoreDbusMinMax": true,
        "unavailableAsFailed": true
    },
    {
        "name": "fan2",
        "type": "fan",
        "readPath": "/xyz/openbmc_project/sensors/fan_tach/fan2",
        "writePath": "/sys/devices/platform/ahb/ahb:apb/1e786000.pwm-tacho-controller/hwmon/**/pwm2",
        "min": 0,
        "max": 255,
        "timeout": 4
    },
...
```

A sensor has a `name`, a `type`, a `readPath`, a `writePath`, a `min` (minimum)
value, a `max` (maximum) value, a `timeout`, an `ignoreDbusMinMax` and an
`unavailableAsFailed` value.

The `name` is used to reference the sensor in the zone portion of the
configuration.

The `type` is the type of sensor it is. This influences how its value is
treated. Supported values are: `fan`, `temp`, and `margin`.

The `readPath` is the path that tells the daemon how to read the value from this
sensor. It is optional, allowing for write-only sensors. If the value is absent
or `None` it'll be treated as a write-only sensor.

If the `readPath` value contains: `/xyz/openbmc_project/extsensors/` it'll be
treated as a sensor hosted by the daemon itself whose value is provided
externally. The daemon will own the sensor and publish it to dbus. This is
currently only supported for `temp` and `margin` sensor types.

If the `readPath` value contains: `/xyz/openbmc_project/` (this is checked after
external), then it's treated as a passive dbus sensor. A passive dbus sensor is
one that listens for property updates to receive its value instead of actively
reading the `Value` property.

If the `readPath` value contains: `/sys/` this is treated as a directly read
sysfs path. There are two supported paths:

* `/sys/class/hwmon/hwmon0/pwm1`
* `/sys/devices/platform/ahb/1e786000.pwm-tacho-controller/hwmon/**/pwm1`

The `writePath` is the path to set the value for the sensor. This is only valid
for a sensor of type `fan`. The path is optional. It can be empty or `None`.
Otherwise it supports only two options.

If the `writePath` value contains: `/sys/` this is treated as a directly
written sysfs path.
There are two supported paths:

* `/sys/class/hwmon/hwmon0/pwm1`
* `/sys/devices/platform/ahb/1e786000.pwm-tacho-controller/hwmon/**/pwm1`

If the `writePath` value contains: `/xyz/openbmc_project/sensors/fan_tach/fan{N}` it
sets up a sensor object that writes over dbus to the
`xyz.openbmc_project.Control.FanPwm` interface. The `writePath` should be the
full object path.

```
busctl introspect xyz.openbmc_project.Hwmon-1644477290.Hwmon1 /xyz/openbmc_project/sensors/fan_tach/fan1 --no-pager
NAME                                TYPE      SIGNATURE RESULT/VALUE                             FLAGS
org.freedesktop.DBus.Introspectable interface -         -                                        -
.Introspect                         method    -         s                                        -
org.freedesktop.DBus.Peer           interface -         -                                        -
.GetMachineId                       method    -         s                                        -
.Ping                               method    -         -                                        -
org.freedesktop.DBus.Properties     interface -         -                                        -
.Get                                method    ss        v                                        -
.GetAll                             method    s         a{sv}                                    -
.Set                                method    ssv       -                                        -
.PropertiesChanged                  signal    sa{sv}as  -                                        -
xyz.openbmc_project.Control.FanPwm  interface -         -                                        -
.Target                             property  t         255                                      emits-change writable
xyz.openbmc_project.Sensor.Value    interface -         -                                        -
.MaxValue                           property  x         0                                        emits-change writable
.MinValue                           property  x         0                                        emits-change writable
.Scale                              property  x         0                                        emits-change writable
.Unit                               property  s         "xyz.openbmc_project.Sensor.Value.Uni... emits-change writable
.Value                              property  x         2823                                     emits-change writable
```

The `min` and `max` values are optional. When `max` is non-zero, the sensor
expects to write a percentage value converted to a value between the minimum and
maximum.

The `timeout` value is optional and controls the sensor failure behavior. If a
sensor is a fan the default value is 2 seconds, otherwise it's 0. When a
sensor's timeout is 0 it isn't checked against a read timeout failure case. If a
sensor fails to be read within the timeout period, the zone goes into failsafe,
because it no longer has all of its inputs and cannot compute a reliable
output.
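
The `readPath` matching order and the `min`/`max` scaling described above can be
sketched in Python. This is only an illustration of the rules in the text, not
the daemon's code: the function names `classify_read_path` and `percent_to_raw`
are hypothetical, the percentage is assumed to be on a 0 to 100 scale, and the
pass-through behavior when `max` is 0 is an assumption.

```python
def classify_read_path(read_path):
    """Classify a readPath using the matching order given in the text."""
    if not read_path or read_path == "None":
        return "write-only"
    # The external check comes first, since these paths also
    # contain "/xyz/openbmc_project/".
    if "/xyz/openbmc_project/extsensors/" in read_path:
        return "external"
    if "/xyz/openbmc_project/" in read_path:
        return "passive-dbus"
    if "/sys/" in read_path:
        return "sysfs"
    raise ValueError("unsupported readPath: " + read_path)


def percent_to_raw(percent, minimum, maximum):
    """Convert a percentage into a raw value between minimum and maximum.

    Assumption: percent is on a 0-100 scale, and a maximum of 0
    means the value is written unchanged.
    """
    if maximum == 0:
        return percent
    return minimum + (maximum - minimum) * (percent / 100.0)


print(classify_read_path("/xyz/openbmc_project/sensors/fan_tach/fan1"))
print(percent_to_raw(50.0, 0, 255))
```

With the `fan1` entry from the example (`"min": 0`, `"max": 255`), a 50 percent
request maps to the middle of the raw PWM range under these assumptions.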

The `ignoreDbusMinMax` value is optional and defaults to false. The dbus
passive sensors check for a `MinValue` and `MaxValue` and scale the incoming
values via these. Setting this property to true will ignore `MinValue` and
`MaxValue` from dbus and therefore won't apply any passive value scaling.

The `unavailableAsFailed` value is optional and defaults to true. However,
some specific thermal sensors should not be treated as Failed when they are
unavailable. For example, when a system is powered off, its CPU/DIMM temperature
sensors are unavailable; in that state these sensors should not be treated as
Failed and should not trigger FailSafe. This is important for systems whose fans
are always on. For these specific sensors set this property to false.

### Zones

```
"zones" : [
    {
        "id": 1,
        "minThermalOutput": 3000.0,
        "failsafePercent": 75.0,
        "pids": [],
...
```

Each zone has its own fields, and a list of controllers.

| field              | type              | meaning                                   |
| ------------------ | ----------------- | ----------------------------------------- |
| `id`               | `int64_t`         | This is a unique identifier for the zone. |
| `minThermalOutput` | `double`          | This is the minimum value that should be considered from the thermal outputs. Commonly used as the minimum fan RPM. |
| `failsafePercent`  | `double`          | If there is a fan PID, it will use this value if the zone goes into fail-safe as the output value written to the fan's sensors. |
| `pids`             | `list of strings` | Fan and thermal controllers used by the zone. |

The `id` field here is used in the D-Bus path to talk to the
`xyz.openbmc_project.Control.Mode` interface.

***TODO:*** Examine how the fan controller always treating its output as a
percentage works for future cases.
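
How `failsafePercent` interacts with the `--strict-failsafe-pwm` build flag
described earlier can be sketched as follows. This is a hedged Python
illustration of the rule stated in the text; `failsafe_output` and its
parameters are hypothetical names, not part of the daemon.

```python
def failsafe_output(calculated_percent, failsafe_percent, in_failsafe, strict=False):
    """Pick the fan output percentage for a zone.

    strict=True models a build with --strict-failsafe-pwm enabled.
    """
    if not in_failsafe:
        return calculated_percent
    if strict:
        # Strict mode: always use the failsafe percent while in failsafe.
        return failsafe_percent
    # Default: use whichever of the two values is higher.
    return max(calculated_percent, failsafe_percent)


print(failsafe_output(90.0, 75.0, in_failsafe=True))
print(failsafe_output(90.0, 75.0, in_failsafe=True, strict=True))
```

With the example zone's `failsafePercent` of 75.0, a calculated output of 90
percent is kept in the default build but lowered to 75 percent in a strict
build.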

A zone collects all the setpoints and ceilings from the thermal
controllers attached to it, selects the maximum setpoint, clamps it by
the minimum ceiling and `minThermalOutput`; the result is used to
control fans.

### Controllers

There are `fan`, `temp`, `margin` (PID), and `stepwise` (discrete steps)
controllers.

The `fan` PID is meant to drive fans or other cooling devices. It
expects to get the setpoint value from the owning zone and then
drives the fans to that value.

A `temp` PID is meant to drive the setpoint given an absolute
temperature value (higher value indicates warmer temperature).

A `margin` PID is meant to drive the setpoint given a margin value
(lower value indicates warmer temperature, in other words, it's the
safety margin remaining expressed in degrees Celsius).

The setpoint output from the thermal controllers is called `RPMSetpoint()`.
However, it doesn't need to be an RPM value.

***TODO:*** Rename this method and others to not say necessarily RPM.

Some PID configurations have fields in common, but may be interpreted
differently.

When using D-Bus, each configuration can have a list of strings called
`Profiles`. In this case the controller will be loaded only if at
least one of them is returned as `Current` from an object implementing
the `xyz.openbmc_project.Control.ThermalMode` interface (which can be
anywhere on D-Bus). `swampd` will automatically reload the full
configuration whenever `Current` is changed.

The D-Bus `Name` attribute is used for indexing in certain cases, so it should
be unique for all defined configurations.

#### PID Field

If the PID `type` is not `stepwise` then the PID field is defined as follows:

| field                | type     | meaning                                   |
| -------------------- | -------- | ----------------------------------------- |
| `samplePeriod`       | `double` | How frequently the value is sampled. 0.1 for fans, 1.0 for temperatures. |
| `proportionalCoeff`  | `double` | The proportional coefficient.             |
| `integralCoeff`      | `double` | The integral coefficient.                 |
| `feedFwdOffsetCoeff` | `double` | The feed forward offset coefficient.      |
| `feedFwdGainCoeff`   | `double` | The feed forward gain coefficient.        |
| `integralLimit_min`  | `double` | The integral minimum clamp value.         |
| `integralLimit_max`  | `double` | The integral maximum clamp value.         |
| `outLim_min`         | `double` | The output minimum clamp value.           |
| `outLim_max`         | `double` | The output maximum clamp value.           |
| `slewNeg`            | `double` | Negative slew value to dampen output.     |
| `slewPos`            | `double` | Positive slew value to accelerate output. |

The units for the coefficients depend on the configuration of the PIDs.

If the PID is a `margin` controller and its `setpoint` is in centigrade and
output in RPM: `proportionalCoeff` is your p value in units of RPM/C and
`integralCoeff` is in units of RPM/C sec.

If the PID is a fan controller whose output is pwm: `proportionalCoeff` is in
%/RPM and `integralCoeff` is in %/RPM sec.

***NOTE:*** The sample periods are specified in the configuration as they are
used in the PID computations. However, they are not truly configurable, as they
are also used for the update periods for the fan and thermal sensors.

#### type == "fan"

```
"name": "fan1-5",
"type": "fan",
"inputs": ["fan1", "fan5"],
"setpoint": 90.0,
"pid": {
...
}
```

The type `fan` builds a `FanController` PID.

| field      | type              | meaning                                     |
| ---------- | ----------------- | ------------------------------------------- |
| `name`     | `string`          | The name of the PID. This is just for humans and logging. |
| `type`     | `string`          | `fan`                                       |
| `inputs`   | `list of strings` | The names of the sensor(s) that are used as input and output for the PID loop. |
| `setpoint` | `double`          | Presently UNUSED                            |
| `pid`      | `dictionary`      | A PID dictionary detailed above.            |

#### type == "margin"

```
"name": "fleetingpid0",
"type": "margin",
"inputs": ["fleeting0"],
"setpoint": 10,
"pid": {
...
}
```

The type `margin` builds a `ThermalController` PID.

| field      | type              | meaning                                     |
| ---------- | ----------------- | ------------------------------------------- |
| `name`     | `string`          | The name of the PID. This is just for humans and logging. |
| `type`     | `string`          | `margin`                                    |
| `inputs`   | `list of strings` | The names of the sensor(s) that are used as input for the PID loop. |
| `setpoint` | `double`          | The setpoint value for the thermal PID. The setpoint for the margin sensors. |
| `pid`      | `dictionary`      | A PID dictionary detailed above.            |

Each input is normally a temperature difference between some hardware
threshold and the current state. E.g. a CPU sensor can be reporting
that it's 20 degrees below the point when it starts thermal
throttling. So the lower the margin temperature, the higher the
corresponding absolute value.

Out of all the `inputs` the minimal value is selected and used as an
input for the PID loop.

The output of a `margin` PID loop sets the setpoint value for the
zone. It does this by adding the value to a list of values. The value chosen by
the fan PIDs (in this cascade configuration) is the maximum value.

#### type == "temp"

Exactly the same as `margin` but all the inputs are supposed to be
absolute temperatures and so the maximal value is used to feed the PID
loop.
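
The input selection for `margin` and `temp` controllers, and the way a zone
combines the resulting setpoints, can be illustrated with a small Python
sketch. The helper names are hypothetical, and the clamping order (minimum
ceiling first, then `minThermalOutput` as a floor) is an assumption drawn from
the zone description above rather than from the daemon's source.

```python
def select_input(controller_type, readings):
    """Pick the single value fed to a thermal PID loop."""
    if controller_type == "margin":
        # The smallest margin is the hottest component.
        return min(readings)
    if controller_type == "temp":
        # The largest absolute temperature is the hottest component.
        return max(readings)
    raise ValueError("not a thermal PID type: " + controller_type)


def zone_setpoint(setpoints, ceilings, min_thermal_output):
    """Combine thermal controller outputs into one zone setpoint (assumed order)."""
    result = max(setpoints) if setpoints else min_thermal_output
    if ceilings:
        result = min(result, min(ceilings))
    return max(result, min_thermal_output)


print(select_input("margin", [20.0, 12.5, 30.0]))
print(zone_setpoint([3500.0, 4200.0], [4000.0], 3000.0))
```

In this example the highest requested setpoint is limited by the single
ceiling, and the result stays above the zone's `minThermalOutput`.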

#### type == "stepwise"

```
"name": "temp1",
"type": "stepwise",
"inputs": ["temp1"],
"setpoint": 30.0,
"pid": {
    "samplePeriod": 0.1,
    "positiveHysteresis": 1.0,
    "negativeHysteresis": 1.0,
    "isCeiling": false,
    "reading": {
        "0": 45,
        "1": 46,
        "2": 47
    },
    "output": {
        "0": 5000,
        "1": 2400,
        "2": 2600
    }
}
```

The type `stepwise` builds a `StepwiseController`.

| field      | type              | meaning                                     |
| ---------- | ----------------- | ------------------------------------------- |
| `name`     | `string`          | The name of the controller. This is just for humans and logging. |
| `type`     | `string`          | `stepwise`                                  |
| `inputs`   | `list of strings` | The names of the sensor(s) that are used as input and output for the controller. |
| `pid`      | `dictionary`      | A controller settings dictionary detailed below. |

The `pid` dictionary (confusingly named) is defined as follows:

| field                | type         | meaning                                   |
| -------------------- | ------------ | ----------------------------------------- |
| `samplePeriod`       | `double`     | Presently UNUSED.                         |
| `reading`            | `dictionary` | Enumerated list of input values, indexed from 0, must be monotonically increasing, maximum 20 items. |
| `output`             | `dictionary` | Enumerated list of output values, indexed from 0, must match the number of `reading` items. |
| `positiveHysteresis` | `double`     | How much the input value must rise to allow the switch to the next step. |
| `negativeHysteresis` | `double`     | How much the input value must drop to allow the switch to the previous step. |
| `isCeiling`          | `bool`       | Whether this controller provides a setpoint or a ceiling for the zone. |
| `setpoint`           | `double`     | Presently UNUSED.                         |

***NOTE:*** `reading` and `output` are normal arrays and not embedded
in the dictionary in Entity Manager.

Each measurement cycle, the maximum value is selected out of all the
`inputs`.
That value is then compared to the list of `reading` values to find
the largest one that is still lower than or equal to the input (the very first
item is used even if it's larger than the input). The corresponding
`output` value is selected if hysteresis allows the switch (the
current input value is compared with the input present at the moment
of the previous switch). The result is added to the list of setpoints
or ceilings for the zone, depending on the `isCeiling` setting.
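
The stepwise lookup described above can be sketched in Python. This is a
simplified model under stated assumptions, not the `StepwiseController`
implementation: the class name `StepwiseSketch` is hypothetical, the table
values are illustrative, and a switch is assumed to be allowed when the input
has moved by at least the relevant hysteresis since the last switch.

```python
class StepwiseSketch:
    """Simplified model of a stepwise controller with hysteresis."""

    def __init__(self, reading, output, positive_hysteresis, negative_hysteresis):
        if len(reading) != len(output):
            raise ValueError("reading and output must have the same length")
        self.reading = list(reading)
        self.output = list(output)
        self.positive_hysteresis = positive_hysteresis
        self.negative_hysteresis = negative_hysteresis
        self.last_input = None   # input seen at the previous switch
        self.last_output = None  # output selected at the previous switch

    def _lookup(self, value):
        # Largest reading that is <= value; the first item is the fallback.
        result = self.output[0]
        for threshold, out in zip(self.reading, self.output):
            if threshold > value:
                break
            result = out
        return result

    def step(self, inputs):
        value = max(inputs)
        if self.last_input is None:
            allowed = True
        else:
            delta = value - self.last_input
            allowed = (delta >= self.positive_hysteresis
                       or -delta >= self.negative_hysteresis)
        if allowed:
            self.last_input = value
            self.last_output = self._lookup(value)
        return self.last_output


ctl = StepwiseSketch([45, 46, 47], [2000, 2400, 2600], 1.0, 1.0)
for sample in ([44.0], [44.5, 40.0], [46.2], [45.6], [44.9]):
    print(sample, ctl.step(sample))
```

The loop shows small input changes being ignored until the input has moved far
enough from the value recorded at the previous switch.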