# ExternalSensor in dbus-sensors Author: Josh Lehan[^1] Other contributors: Ed Tanous, Peter Lundgren, Alex Qiu Created: March 19, 2021 ## Introduction In OpenBMC, the _dbus-sensors_[^2] package contains a suite of sensor daemons. Each daemon monitors a particular type of sensor. This document provides rationale and motivation for adding _ExternalSensor_, another sensor daemon, and gives some example usages of it. ## Motivation There are 10 existing sensor daemons in _dbus-sensors_. Why add another sensor daemon? - Most of the existing sensor daemons are tied to one particular physical quantity they are measuring, such as temperature, and are hardcoded as such. An externally-updated sensor has no such limitation, and should be flexible enough to measure any physical quantity currently supported by OpenBMC. - Essentially all of the existing sensor daemons obtain the sensor values they publish to D-Bus by reading from local hardware (typically by reading from virtual files provided by the _hwmon_[^3] subsystem of the Linux kernel). None of the daemons are currently designed with the intention of accepting values pushed in from an external source. Although there is some debugging functionality to add this feature to other sensor daemons[^25], it is not the primary purpose for which they were designed. - Even if the debugging functionality of an existing daemon were to be used, the daemon would still need a valid configuration tied to recognized hardware, as detected by _entity-manager_[^4], in order for the daemon to properly initialize itself and participate in the OpenBMC software stack. - For the same reason it is desirable for existing sensor daemons to detect and properly indicate failures of their underlying hardware, it is desirable for _ExternalSensor_ to detect and properly indicate loss of timely sensor updates from their external source. This is a new feature, and does not cleanly fit into the architecture of any existing sensor daemon, thus a new daemon is the correct choice for this behavior. For these reasons, _ExternalSensor_ has been added[^5], as the eleventh sensor daemon in _dbus-sensors_. ## Design After some discussion, a proof-of-concept _HostSensor_[^6] was published. This was a stub, but it revealed the minimal implementation that would still be capable of fully initializing and participating in the OpenBMC software stack. _ExternalSensor_ was formed by using this example _HostSensor_, and also one of the simplest existing sensor daemons, _HwmonTempSensor_[^7], as references to build upon. As written, after validating parameters during initialization, there is essentially no work for _ExternalSensor_ to do. The main loop is mostly idle, remaining blocked in the Boost ASIO[^8] library, handling D-Bus requests as they come in. This utilizes the functionality in the underlying _Sensor_[^9] class, which already contains the D-Bus hooks necessary to receive values from the external source. An example external source is the IPMI service[^10], receiving values from the host via the IPMI "Set Sensor Reading" command[^11]. _ExternalSensor_ is intended to be source-agnostic, so it does not matter if this is IPMI or Redfish[^12] or something else in the future, as long as they are received similarly over D-Bus. ### Timeout The timeout feature is the primary feature which distinguishes _ExternalSensor_ from other sensor daemons. Once an external source starts providing updates, the external source is expected to continue to provide timely updates. Each update will be properly published onto D-Bus, in the usual way done by all sensor daemons, as a floating-point value. A timer is used, the same Boost ASIO[^13] timer mechanism used by other sensor daemons to poll their hardware, but in this case, is used to manage how long it has been since the last known good external update. When the timer expires, the sensor value will be deemed stale, and will be replaced with floating-point quiet _NaN_[^14]. ### NaN The advantage of floating-point _NaN_ is that it is a drop-in replacement for the valid floating-point value of the sensor. A subtle difference of the earlier OpenBMC sensor "Value" schema change, from integer to floating-point, is that the field is essentially now nullable. Instead of having to arbitrarily choose an arbitrary integer value to indicate "not valid", such as -1 or 9999 or whatever, floating-point explicitly has _NaN_ to indicate this. So, there is no possibility of confusion that this will be mistaken for a valid sensor value, as _NaN_ is literally _not a number_, and thus can not be misparsed as a valid sensor reading. It thus saves having to add a second field to reliably indicate validity, which would break the existing schema[^15]. An alternative to using _NaN_ for staleness indication would have been to use a timestamp, which would introduce the complication of having to parse and compare timestamps within OpenBMC, and all the subtle difficulties thereof[^16]. What's more, adding a second field might require a second D-Bus message to update, and D-Bus messages are computationally expensive[^17] and should be used sparingly. Periodic things like sensors, which send out regular updates, could easily lead to frequent D-Bus traffic and thus should be kept as minimal as practical. And finally, changing the Value schema would cause a large blast radius, both in design and in code, necessitating a large refactoring effort well beyond the scope of what is needed for _ExternalSensor_. ### Configuration Configuring a sensor for use with _ExternalSensor_ should be done in the usual way[^18] that is done for use with other sensor daemons, namely, a JSON dictionary that is an element of the "Exposes" array within a JSON configuration file to be read by _entity-manager_. In that JSON dictionary, the valid names are listed below. All of these are mandatory parameters, unless mentioned as optional. For fields listed as "Numeric" below, this means that it can be either integer or valid floating-point. - "Name": String. The sensor name, which this sensor will be known as. A mandatory component of the `entity-manager` configuration, and the resulting D-Bus object path. - "Units": String. This parameter is unique to _ExternalSensor_. As _ExternalSensor_ is not tied to any particular physical hardware, it can measure any physical quantity supported by OpenBMC. This string will be translated to another string via a lookup table[^19], and forms another mandatory component of the D-Bus object path. - "MinValue": Numeric. The minimum valid value for this sensor. Although not used by _ExternalSensor_ directly, it is a valuable hint for services such as IPMI, which need to know the minimum and maximum valid sensor values in order to scale their reporting range accurately. As _ExternalSensor_ is not tied to one particular physical quantity, there is no suitable default value for minimum and maximum. Thus, unlike other sensor daemons where this parameter is optional, in _ExternalSensor_ it is mandatory. - "MaxValue": Numeric. The maximum valid value for this sensor. It is treated similarly to "MinValue". - "Timeout": Numeric. This parameter is unique to _ExternalSensor_. It is the timeout value, in seconds. If this amount of time elapses with no new updates received over D-Bus from the external source, this sensor will be deemed stale. The value of this sensor will be replaced with floating-point _NaN_, as described above. This field is optional. If not given, the timeout feature will be disabled for this sensor (so it will never be deemed stale). - "Type": String. Must be exactly "ExternalSensor". This string is used by _ExternalSensor_ to obtain configuration information from _entity-manager_ during initialization. This string is what differentiates JSON stanzas intended for _ExternalSensor_ versus JSON stanzas intended for other _dbus-sensors_ sensor daemons. - "Thresholds": JSON dictionary. This field is optional. It is passed through to the main _Sensor_ class during initialization, similar to other sensor daemons. Other than that, it is not used by _ExternalSensor_. - "PowerState": String. This field is optional. Similarly to "Thresholds", it is passed through to the main _Sensor_ class during initialization. Here is an example. The sensor created by this stanza will form this object path: /xyz/openbmc_project/sensors/temperature/HostDevTemp ``` { "Name": "HostDevTemp", "Units": "DegreesC", "MinValue": -16.0, "MaxValue": 111.5, "Timeout": 4.0, "Type": "ExternalSensor" }, ``` There can be multiple _ExternalSensor_ sensors in the configuration. There is no set limit on the number of sensors, except what is supported by a service such as IPMI. ## Implementation As it stands now, _ExternalSensor_ is up and running[^20]. However, the timeout feature was originally implemented at the IPMI layer. Upon further investigation, it was found that IPMI was the wrong place for this feature, and that it should be moved within _ExternalSensor_ itself[^21]. It was originally thought that the timeout feature would be a useful enhancement available to all IPMI sensors, however, expected usage of almost all external sensor updates is a one-shot adjustment (for example, somebody wishes to change a voltage regulator setting, or fan speed setting). In this case, the timeout feature would not only not be necessary, it would get in the way and require additional coding[^22] to compensate for the unexpected _NaN_ value. Only sensors intended for use with _ExternalSensor_ are expected to receive continuous periodic updates from an external source, so it makes sense to move this timeout feature into _ExternalSensor_. This change also has the advantage of making _ExternalSensor_ not dependent on IPMI as the only source of external updates. A challenge of generalizing the timeout feature into _ExternalSensor_, however, was that the existing _Sensor_ base class did not currently allow its existing D-Bus setter hook to be customized. This feature was straightforward to add[^23]. One limitation was that the existing _Sensor_ class, by design, dropped updates that duplicated the existing sensor value. For use with _ExternalSensor_, we want to recognize all updates received, even duplicates, as they are important to pet the watchdog, to avoid inadvertently triggering the timeout feature. However, it is still important to avoid needlessly sending the D-Bus _PropertiesChanged_ event for duplicate readings. The timeout value was originally a compiled-in constant. If _ExternalSensor_ is to succeed as a general-purpose tool, this must be configurable. It was straightforward to add another configurable parameter[^24] to accept this timeout value, as shown in "Parameters" above. The hardest task of all, however, was getting it accepted upstream. If you are reading this, then most likely, it was successful! ## Footnotes [^1]: https://gerrit.openbmc.org/q/owner:krellan%2540google.com [^2]: https://github.com/openbmc/dbus-sensors/blob/master/README.md [^3]: https://www.kernel.org/doc/html/latest/hwmon/index.html [^4]: https://github.com/openbmc/entity-manager/blob/master/README.md [^5]: https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/36206 [^6]: https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/35476 [^7]: https://github.com/openbmc/dbus-sensors/blob/master/src/HwmonTempMain.cpp [^8]: https://think-async.com/Asio/ [^9]: https://github.com/openbmc/dbus-sensors/blob/master/include/sensor.hpp [^10]: https://github.com/openbmc/docs/blob/master/architecture/ipmi-architecture.md [^11]: https://www.intel.com/content/www/us/en/servers/ipmi/ipmi-intelligent-platform-mgt-interface-spec-2nd-gen-v2-0-spec-update.html [^12]: https://www.dmtf.org/standards/redfish [^13]: https://www.boost.org/doc/libs/1_75_0/doc/html/boost_asio/overview/timers.html [^14]: https://anniecherkaev.com/the-secret-life-of-nan [^15]: https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Sensor/Value.interface.yaml [^16]: https://cr.yp.to/proto/utctai.html [^17]: https://github.com/openbmc/openbmc/issues/1892 [^18]: https://github.com/openbmc/entity-manager/blob/master/docs/my_first_sensors.md [^19]: https://github.com/openbmc/dbus-sensors/blob/master/src/SensorPaths.cpp [^20]: https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/36206 [^21]: https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/41398 [^22]: https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/39294 [^23]: https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/41394 [^24]: https://gerrit.openbmc.org/c/openbmc/entity-manager/+/41397 [^25]: https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/16177