# How to Configure Phosphor-pid-control

A system needs two groups of configurations: zones and sensors.

They can come either from a dedicated config file or via D-Bus from e.g.
`entity-manager`.

## D-Bus Configuration

If the config file does not exist, the configuration is obtained from a set of
D-Bus interfaces. When using `entity-manager` to provide them, refer to the
`Pid`, `Pid.Zone` and `Stepwise`
[schemas](https://github.com/openbmc/entity-manager/tree/master/schemas). The
key names are not identical to the JSON ones but are similar enough to see the
correspondence.

## Compile Flag Configuration

### --strict-failsafe-pwm

This build flag is used to set the fans strictly at the failsafe percent when in
failsafe mode, even when the calculated PWM is higher than failsafe PWM. Without
this enabled, the PWM is calculated and set to the calculated PWM **or** the
failsafe PWM, whichever is higher.
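
The selection rule described above can be sketched as follows. This is only an illustration of the documented behavior, not the daemon's code; `failsafe_pwm` and its parameters are hypothetical names.

```python
def failsafe_pwm(calculated, failsafe, strict):
    """Pick the PWM percent to write while the zone is in failsafe mode.

    strict=True models a build with --strict-failsafe-pwm.
    """
    if strict:
        # Always pin the fans at the failsafe percent.
        return failsafe
    # Otherwise use whichever of the two values is higher.
    return max(calculated, failsafe)


print(failsafe_pwm(90.0, 75.0, strict=True))
print(failsafe_pwm(90.0, 75.0, strict=False))
```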

### --offline-failsafe-pwm

This build flag is used to set the fans to the failsafe percent when offline.
The controller is offline when it's rebuilding the configuration or when it's
about to shut down.

## JSON Configuration

The default config file path `/usr/share/swampd/config.json` can be overridden
with the `--conf` command-line option.

The JSON object should be a dictionary with two keys, `sensors` and `zones`.
`sensors` is a list of sensor dictionaries, whereas `zones` is a list of zone
dictionaries.
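
A minimal sketch of checking that top-level shape before deploying a file is shown below. It is not part of the daemon; `check_top_level` is a hypothetical helper and the embedded configuration is a trimmed-down example made up for illustration.

```python
import json

# Hypothetical, trimmed-down configuration used only for illustration.
EXAMPLE = """
{
    "sensors": [
        {"name": "fan1", "type": "fan"}
    ],
    "zones": [
        {"id": 1, "minThermalOutput": 3000.0, "failsafePercent": 75.0, "pids": []}
    ]
}
"""


def check_top_level(text):
    """Parse the text and confirm the two expected top-level lists exist."""
    config = json.loads(text)
    if not isinstance(config, dict):
        raise ValueError("top-level JSON value must be a dictionary")
    for key in ("sensors", "zones"):
        if not isinstance(config.get(key), list):
            raise ValueError(f"'{key}' must be a list")
    return config


config = check_top_level(EXAMPLE)
print(len(config["sensors"]), len(config["zones"]))
```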

### Sensors

```
"sensors" : [
    {
        "name": "fan1",
        "type": "fan",
        "readPath": "/xyz/openbmc_project/sensors/fan_tach/fan1",
        "writePath": "/sys/devices/platform/ahb/ahb:apb/1e786000.pwm-tacho-controller/hwmon/**/pwm1",
        "min": 0,
        "max": 255,
        "ignoreDbusMinMax": true,
        "unavailableAsFailed": true
    },
    {
        "name": "fan2",
        "type": "fan",
        "readPath": "/xyz/openbmc_project/sensors/fan_tach/fan2",
        "writePath": "/sys/devices/platform/ahb/ahb:apb/1e786000.pwm-tacho-controller/hwmon/**/pwm2",
        "min": 0,
        "max": 255,
        "timeout": 4
    },
...
```

A sensor has a `name`, a `type`, a `readPath`, a `writePath`, a minimum (`min`)
value, a maximum (`max`) value, a `timeout`, an `ignoreDbusMinMax` and an
`unavailableAsFailed` value.

The `name` is used to reference the sensor in the zone portion of the
configuration.

The `type` is the type of the sensor. This influences how its value is treated.
Supported values are: `fan`, `temp`, and `margin`.

The `readPath` is the path that tells the daemon how to read the value from this
sensor. It is optional, allowing for write-only sensors. If the value is absent
or `None`, the sensor is treated as write-only.

If the `readPath` value contains `/xyz/openbmc_project/extsensors/`, it is
treated as a sensor hosted by the daemon itself whose value is provided
externally. The daemon will own the sensor and publish it to D-Bus. This is
currently only supported for the `temp` and `margin` sensor types.

If the `readPath` value contains `/xyz/openbmc_project/` (this is checked after
the external case), it is treated as a passive D-Bus sensor. A passive D-Bus
sensor is one that listens for property updates to receive its value instead of
actively reading the `Value` property.

If the `readPath` value contains `/sys/`, it is treated as a directly read
sysfs path. Two forms of path are supported:

- `/sys/class/hwmon/hwmon0/pwm1`
- `/sys/devices/platform/ahb/1e786000.pwm-tacho-controller/hwmon/**/pwm1`
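
The order of those `readPath` checks can be sketched as below. This only mirrors the rules as written in this document; `classify_read_path` and the returned labels are hypothetical and do not correspond to identifiers in the daemon.

```python
def classify_read_path(read_path):
    """Return a label for how a sensor's readPath would be interpreted."""
    if not read_path or read_path == "None":
        return "write-only"
    # The external check comes first because its prefix also matches
    # the more general D-Bus prefix below.
    if "/xyz/openbmc_project/extsensors/" in read_path:
        return "external"
    if "/xyz/openbmc_project/" in read_path:
        return "passive-dbus"
    if "/sys/" in read_path:
        return "sysfs"
    return "unknown"


for path in (
    None,
    "/xyz/openbmc_project/extsensors/margin/sluggish0",
    "/xyz/openbmc_project/sensors/fan_tach/fan1",
    "/sys/class/hwmon/hwmon0/pwm1",
):
    print(path, "->", classify_read_path(path))
```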

The `writePath` is the path used to set the value for the sensor. This is only
valid for a sensor of type `fan`. The path is optional: it can be empty or
`None`. When set, it supports only two options.

If the `writePath` value contains `/sys/`, it is treated as a directly written
sysfs path. Two forms of path are supported:

- `/sys/class/hwmon/hwmon0/pwm1`
- `/sys/devices/platform/ahb/1e786000.pwm-tacho-controller/hwmon/**/pwm1`

If the `writePath` value contains
`/xyz/openbmc_project/sensors/fan_tach/fan{N}`, it sets up a sensor object that
writes over D-Bus to the `xyz.openbmc_project.Control.FanPwm` interface. The
`writePath` should be the full object path.

```
busctl introspect xyz.openbmc_project.Hwmon-1644477290.Hwmon1 /xyz/openbmc_project/sensors/fan_tach/fan1 --no-pager
NAME                                TYPE      SIGNATURE RESULT/VALUE                             FLAGS
org.freedesktop.DBus.Introspectable interface -         -                                        -
.Introspect                         method    -         s                                        -
org.freedesktop.DBus.Peer           interface -         -                                        -
.GetMachineId                       method    -         s                                        -
.Ping                               method    -         -                                        -
org.freedesktop.DBus.Properties     interface -         -                                        -
.Get                                method    ss        v                                        -
.GetAll                             method    s         a{sv}                                    -
.Set                                method    ssv       -                                        -
.PropertiesChanged                  signal    sa{sv}as  -                                        -
xyz.openbmc_project.Control.FanPwm  interface -         -                                        -
.Target                             property  t         255                                      emits-change writable
xyz.openbmc_project.Sensor.Value    interface -         -                                        -
.MaxValue                           property  x         0                                        emits-change writable
.MinValue                           property  x         0                                        emits-change writable
.Scale                              property  x         0                                        emits-change writable
.Unit                               property  s         "xyz.openbmc_project.Sensor.Value.Uni... emits-change writable
.Value                              property  x         2823                                     emits-change writable
```

The `min` and `max` values are optional. When `max` is non-zero, the daemon
expects to write a percentage value converted to a value between `min` and
`max`.
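
As a rough sketch of such a conversion, the snippet below maps a percentage linearly onto the configured range. The linear formula, the 0 to 100 percent scale, and the pass-through when `max` is zero are assumptions made for illustration; `percent_to_raw` is a hypothetical helper, not a function of the daemon.

```python
def percent_to_raw(percent, minimum, maximum):
    """Map a 0-100 percent value onto [minimum, maximum] linearly."""
    if maximum == 0:
        # Assumption: with no maximum configured, write the value unchanged.
        return percent
    return minimum + (percent / 100.0) * (maximum - minimum)


# With "min": 0 and "max": 255 as in the sensor example above.
print(percent_to_raw(50.0, 0, 255))
print(percent_to_raw(100.0, 0, 255))
```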

The `timeout` value is optional and controls the sensor failure behavior. If a
sensor is a fan the default value is 2 seconds, otherwise it's 0. When a
sensor's timeout is 0, it isn't checked for read timeout failures. If a sensor
fails to be read within the timeout period, the zone goes into failsafe, because
without all of its inputs it cannot decide what to do.
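
The default and the "0 disables the check" rule can be sketched as follows. Both helpers are hypothetical, and whether the real comparison is strict or inclusive is an assumption here.

```python
def effective_timeout(sensor):
    """Return the timeout in seconds, applying the documented defaults."""
    if "timeout" in sensor:
        return sensor["timeout"]
    return 2 if sensor["type"] == "fan" else 0


def read_timed_out(sensor, seconds_since_last_read):
    """Return True when the sensor should count as a read timeout failure."""
    timeout = effective_timeout(sensor)
    if timeout == 0:
        # A timeout of 0 means the sensor is never checked.
        return False
    # Assumption: strictly greater than the timeout counts as a failure.
    return seconds_since_last_read > timeout


fan = {"name": "fan1", "type": "fan"}
temp = {"name": "temp1", "type": "temp"}
print(read_timed_out(fan, 3.0), read_timed_out(temp, 3.0))
```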

The `ignoreDbusMinMax` value is optional and defaults to false. The passive
D-Bus sensors check for a `MinValue` and `MaxValue` and scale the incoming
values via these. Setting this property to true will ignore `MinValue` and
`MaxValue` from D-Bus and therefore won't apply any passive value scaling.

The `unavailableAsFailed` value is optional and defaults to true. However, some
specific thermal sensors should not be treated as Failed when they are
unavailable. For example, when a system is powered off, its CPU/DIMM Temp
sensors are unavailable. In that state, these sensors should not be treated as
Failed and trigger FailSafe. This is important for systems whose fans are always
on. For these specific sensors, set this property to false.

### Zones

```
"zones" : [
        {
            "id": 1,
            "minThermalOutput": 3000.0,
            "failsafePercent": 75.0,
            "pids": [],
...
```

Each zone has its own fields, and a list of controllers.

| field              | type              | meaning                                                                                                                         |
| ------------------ | ----------------- | ------------------------------------------------------------------------------------------------------------------------------- |
| `id`               | `int64_t`         | This is a unique identifier for the zone.                                                                                       |
| `minThermalOutput` | `double`          | This is the minimum value that should be considered from the thermal outputs. Commonly used as the minimum fan RPM.             |
| `failsafePercent`  | `double`          | If there is a fan PID, it will use this value if the zone goes into fail-safe as the output value written to the fan's sensors. |
| `pids`             | `list of strings` | Fan and thermal controllers used by the zone.                                                                                   |

The `id` field here is used in the D-Bus path to talk to the
`xyz.openbmc_project.Control.Mode` interface.

**_TODO:_** Examine how the fan controller always treating its output as a
percentage works for future cases.

A zone collects all the setpoints and ceilings from the thermal controllers
attached to it, selects the maximum setpoint, and clamps it by the minimum
ceiling and `minThermalOutput`; the result is used to control fans.
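
That selection can be sketched as below. The order of the two clamps (ceiling first, then `minThermalOutput`) and the handling of an empty setpoint list are assumptions for illustration; `zone_setpoint` is a hypothetical helper.

```python
def zone_setpoint(setpoints, ceilings, min_thermal_output):
    """Combine thermal controller outputs into one setpoint for the fans."""
    # Start from the most demanding (highest) setpoint.
    result = max(setpoints) if setpoints else 0.0
    # The lowest ceiling caps the request.
    if ceilings:
        result = min(result, min(ceilings))
    # Assumption: the zone minimum is applied last, so it wins over a ceiling.
    return max(result, min_thermal_output)


print(zone_setpoint([3500.0, 4200.0], [4000.0], 3000.0))
print(zone_setpoint([2000.0], [], 3000.0))
```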

### Controllers

There are `fan`, `temp`, `margin` (PID), and `stepwise` (discrete steps)
controllers.

The `fan` PID is meant to drive fans or other cooling devices. It expects to
get the setpoint value from the owning zone and then drives the fans to that
value.

A `temp` PID is meant to drive the setpoint given an absolute temperature value
(higher value indicates warmer temperature).

A `margin` PID is meant to drive the setpoint given a margin value (lower value
indicates warmer temperature, in other words, it's the safety margin remaining
expressed in degrees Celsius).

The setpoint output from the thermal controllers is called `RPMSetpoint()`.
However, it doesn't need to be an RPM value.

**_TODO:_** Rename this method and others to not say necessarily RPM.

Some PID configurations have fields in common, but may be interpreted
differently.

When using D-Bus, each configuration can have a list of strings called
`Profiles`. In this case the controller will be loaded only if at least one of
them is returned as `Current` from an object implementing the
`xyz.openbmc_project.Control.ThermalMode` interface (which can be anywhere on
D-Bus). `swampd` will automatically reload the full configuration whenever
`Current` is changed.
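
The profile filter can be pictured with the sketch below. `should_load` is a hypothetical helper, and treating a configuration without `Profiles` as always loaded is an assumption, since the text above only describes the case where the list is present.

```python
def should_load(profiles, current):
    """Decide whether a controller configuration applies to the current mode."""
    if not profiles:
        # Assumption: no Profiles list means the controller is always loaded.
        return True
    return current in profiles


print(should_load(["Acoustic", "Performance"], "Performance"))
print(should_load(["Acoustic"], "Performance"))
```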

The D-Bus `Name` attribute is used for indexing in certain cases, so it should
be unique across all defined configurations.

#### PID Field

If the PID `type` is not `stepwise` then the PID field is defined as follows:

| field                | type     | meaning                                                                  |
| -------------------- | -------- | ------------------------------------------------------------------------ |
| `samplePeriod`       | `double` | How frequently the value is sampled. 0.1 for fans, 1.0 for temperatures. |
| `proportionalCoeff`  | `double` | The proportional coefficient.                                            |
| `integralCoeff`      | `double` | The integral coefficient.                                                |
| `feedFwdOffsetCoeff` | `double` | The feed forward offset coefficient.                                     |
| `feedFwdGainCoeff`   | `double` | The feed forward gain coefficient.                                       |
| `integralLimit_min`  | `double` | The integral minimum clamp value.                                        |
| `integralLimit_max`  | `double` | The integral maximum clamp value.                                        |
| `outLim_min`         | `double` | The output minimum clamp value.                                          |
| `outLim_max`         | `double` | The output maximum clamp value.                                          |
| `slewNeg`            | `double` | Negative slew value to dampen output.                                    |
| `slewPos`            | `double` | Positive slew value to accelerate output.                                |

The units for the coefficients depend on the configuration of the PIDs.

If the PID is a `margin` controller whose `setpoint` is in degrees Celsius and
whose output is in RPM, `proportionalCoeff` is in units of RPM/C and
`integralCoeff` is in units of RPM/(C sec).

If the PID is a fan controller whose output is PWM, `proportionalCoeff` is in
units of %/RPM and `integralCoeff` is in units of %/(RPM sec).

**_NOTE:_** The sample periods are specified in the configuration as they are
used in the PID computations, however, they are not truly configurable as they
are used for the update periods for the fan and thermal sensors.
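
To make the roles of the fields and units above concrete, here is a minimal sketch of one textbook PI update using the same field names. It is an illustration under stated assumptions, not the daemon's actual computation: the feed-forward and slew fields are left out, `pi_step` and `clamp` are hypothetical helpers, and the coefficients are made up.

```python
def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(value, high))


def pi_step(error, integral, pid):
    """Run one PI update and return (output, new_integral).

    error is in the input's unit (for example C) and the output is in the
    output's unit (for example RPM).
    """
    # Integral term: coefficient * error * elapsed time, then clamped.
    integral += pid["integralCoeff"] * error * pid["samplePeriod"]
    integral = clamp(integral, pid["integralLimit_min"], pid["integralLimit_max"])

    # Proportional term plus the accumulated integral, then clamped.
    output = pid["proportionalCoeff"] * error + integral
    output = clamp(output, pid["outLim_min"], pid["outLim_max"])
    return output, integral


# Hypothetical coefficients chosen only to keep the arithmetic easy to follow.
pid = {
    "samplePeriod": 1.0,
    "proportionalCoeff": 100.0,  # RPM per C
    "integralCoeff": 10.0,       # RPM per (C sec)
    "integralLimit_min": 0.0,
    "integralLimit_max": 500.0,
    "outLim_min": 0.0,
    "outLim_max": 8000.0,
}

output, integral = pi_step(error=2.0, integral=0.0, pid=pid)
print(output, integral)
```

With an error of 2 C, the proportional part contributes 200 RPM and the integral part 20 RPM after one 1-second sample, which is how the RPM/C and RPM/(C sec) units combine.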

#### type == "fan"

```
"name": "fan1-5",
"type": "fan",
"inputs": ["fan1", "fan5"],
"setpoint": 90.0,
"pid": {
...
}
```

The type `fan` builds a `FanController` PID.

| field      | type              | meaning                                                                        |
| ---------- | ----------------- | ------------------------------------------------------------------------------ |
| `name`     | `string`          | The name of the PID. This is just for humans and logging.                      |
| `type`     | `string`          | `fan`                                                                          |
| `inputs`   | `list of strings` | The names of the sensor(s) that are used as input and output for the PID loop. |
| `setpoint` | `double`          | Presently UNUSED                                                               |
| `pid`      | `dictionary`      | A PID dictionary detailed above.                                               |

#### type == "margin"

```
"name": "fleetingpid0",
"type": "margin",
"inputs": ["fleeting0"],
"setpoint": 10,
"pid": {
...
}
```

The type `margin` builds a `ThermalController` PID.

| field      | type              | meaning                                                                      |
| ---------- | ----------------- | ---------------------------------------------------------------------------- |
| `name`     | `string`          | The name of the PID. This is just for humans and logging.                    |
| `type`     | `string`          | `margin`                                                                     |
| `inputs`   | `list of strings` | The names of the sensor(s) that are used as input for the PID loop.          |
| `setpoint` | `double`          | The setpoint value for the thermal PID. The setpoint for the margin sensors. |
| `pid`      | `dictionary`      | A PID dictionary detailed above.                                             |

Each input is normally a temperature difference between some hardware threshold
and the current state. E.g. a CPU sensor can be reporting that it's 20 degrees
below the point when it starts thermal throttling. So the lower the margin
value, the higher the corresponding absolute temperature.

Out of all the `inputs`, the minimal value is selected and used as an input for
the PID loop.

The output of a `margin` PID loop sets the setpoint value for the zone. It does
this by adding the value to a list of values. The value chosen by the fan PIDs
(in this cascade configuration) is the maximum value.

#### type == "temp"

Exactly the same as `margin`, but all the inputs are supposed to be absolute
temperatures and so the maximal value is used to feed the PID loop.
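
The difference in input selection between the two thermal controller types can be sketched like this. `select_input` is a hypothetical helper, and the sample values are made up.

```python
def select_input(controller_type, values):
    """Pick the worst-case reading that feeds the PID loop."""
    if controller_type == "margin":
        # Smallest remaining margin is the hottest component.
        return min(values)
    if controller_type == "temp":
        # Highest absolute temperature is the hottest component.
        return max(values)
    raise ValueError(f"not a thermal controller type: {controller_type}")


print(select_input("margin", [20.0, 12.5, 30.0]))
print(select_input("temp", [41.0, 55.5, 48.0]))
```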

#### type == "stepwise"

```
"name": "temp1",
"type": "stepwise",
"inputs": ["temp1"],
"setpoint": 30.0,
"pid": {
  "samplePeriod": 0.1,
  "positiveHysteresis": 1.0,
  "negativeHysteresis": 1.0,
  "isCeiling": false,
  "reading": {
    "0": 45,
    "1": 46,
    "2": 47
  },
  "output": {
    "0": 5000,
    "1": 2400,
    "2": 2600
  }
}
```

The type `stepwise` builds a `StepwiseController`.

| field    | type              | meaning                                                                          |
| -------- | ----------------- | -------------------------------------------------------------------------------- |
| `name`   | `string`          | The name of the controller. This is just for humans and logging.                 |
| `type`   | `string`          | `stepwise`                                                                       |
| `inputs` | `list of strings` | The names of the sensor(s) that are used as input and output for the controller. |
| `pid`    | `dictionary`      | A controller settings dictionary detailed below.                                 |

The `pid` dictionary (confusingly named) is defined as follows:

| field                | type         | meaning                                                                                              |
| -------------------- | ------------ | ---------------------------------------------------------------------------------------------------- |
| `samplePeriod`       | `double`     | Presently UNUSED.                                                                                    |
| `reading`            | `dictionary` | Enumerated list of input values, indexed from 0, must be monotonically increasing, maximum 20 items. |
| `output`             | `dictionary` | Enumerated list of output values, indexed from 0, must match the amount of `reading` items.          |
| `positiveHysteresis` | `double`     | How much the input value must rise to allow the switch to the next step.                             |
| `negativeHysteresis` | `double`     | How much the input value must drop to allow the switch to the previous step.                         |
| `isCeiling`          | `bool`       | Whether this controller provides a setpoint or a ceiling for the zone.                               |
| `setpoint`           | `double`     | Presently UNUSED.                                                                                    |

**_NOTE:_** `reading` and `output` are normal arrays and not embedded in the
dictionary in Entity Manager.

In each measurement cycle, the maximum value out of all the `inputs` is
selected. It is then compared to the list of `reading` values to find the
largest one that is still lower than or equal to the input (the very first item
is used even if it's larger than the input). The corresponding `output` value is
selected if hysteresis allows the switch (the current input value is compared
with the input present at the moment of the previous switch). The result is
added to the list of setpoints or ceilings for the zone, depending on the
`isCeiling` setting.
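
The lookup and hysteresis steps described above can be sketched as follows. This is an illustration, not the daemon's implementation: `stepwise_lookup` and `StepwiseSketch` are hypothetical names, `reading` and `output` are passed as plain lists rather than the indexed dictionaries of the JSON format, and the strict `>` comparison against the hysteresis values is an assumption.

```python
import math


def stepwise_lookup(reading, output, value):
    """Return the output for the largest reading that is <= value.

    The first step is used even when value is below reading[0].
    """
    index = 0
    for i, threshold in enumerate(reading):
        if threshold <= value:
            index = i
        else:
            break
    return output[index]


class StepwiseSketch:
    """Hypothetical stand-in for a stepwise controller with hysteresis."""

    def __init__(self, reading, output, positive_hysteresis, negative_hysteresis):
        self.reading = reading
        self.output = output
        self.positive_hysteresis = positive_hysteresis
        self.negative_hysteresis = negative_hysteresis
        self.last_input = math.nan   # input seen at the previous switch
        self.last_output = math.nan  # output chosen at the previous switch

    def process(self, inputs):
        value = max(inputs)
        first_run = math.isnan(self.last_output)
        rose = value - self.last_input > self.positive_hysteresis
        dropped = self.last_input - value > self.negative_hysteresis
        if first_run or rose or dropped:
            self.last_output = stepwise_lookup(self.reading, self.output, value)
            self.last_input = value
        return self.last_output


# Values taken from the stepwise example above.
ctrl = StepwiseSketch([45, 46, 47], [5000, 2400, 2600], 1.0, 1.0)
results = [ctrl.process([v]) for v in (45.0, 45.5, 46.5, 46.0, 44.0)]
print(results)
```

In this run the output only changes on the third and fifth samples, because the other samples moved less than the hysteresis away from the input recorded at the previous switch.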