# ExternalSensor in dbus-sensors

Author: Josh Lehan[^1]

Other contributors: Ed Tanous, Peter Lundgren, Alex Qiu

Created: March 19, 2021

## Introduction

In OpenBMC, the _dbus-sensors_[^2] package contains a suite of sensor daemons.
Each daemon monitors a particular type of sensor. This document provides
rationale and motivation for adding _ExternalSensor_, another sensor daemon, and
gives some example usages of it.

## Motivation

There are 10 existing sensor daemons in _dbus-sensors_. Why add another sensor
daemon?

- Most of the existing sensor daemons are tied to one particular physical
  quantity they are measuring, such as temperature, and are hardcoded as such.
  An externally-updated sensor has no such limitation, and should be flexible
  enough to measure any physical quantity currently supported by OpenBMC.

- Essentially all of the existing sensor daemons obtain the sensor values they
  publish to D-Bus by reading from local hardware (typically by reading from
  virtual files provided by the _hwmon_[^3] subsystem of the Linux kernel). None
  of the daemons are currently designed with the intention of accepting values
  pushed in from an external source.
  Although there is some debugging functionality to add this feature to other
  sensor daemons[^25], it is not the primary purpose for which they were
  designed.

- Even if the debugging functionality of an existing daemon were to be used, the
  daemon would still need a valid configuration tied to recognized hardware, as
  detected by _entity-manager_[^4], in order for the daemon to properly
  initialize itself and participate in the OpenBMC software stack.

- For the same reason it is desirable for existing sensor daemons to detect and
  properly indicate failures of their underlying hardware, it is desirable for
  _ExternalSensor_ to detect and properly indicate loss of timely sensor updates
  from its external source. This is a new feature, and does not cleanly fit
  into the architecture of any existing sensor daemon, so a new daemon is the
  correct choice for this behavior.

For these reasons, _ExternalSensor_ has been added[^5] as the eleventh sensor
daemon in _dbus-sensors_.

## Design

After some discussion, a proof-of-concept _HostSensor_[^6] was published. This
was a stub, but it revealed the minimal implementation that would still be
capable of fully initializing and participating in the OpenBMC software stack.
_ExternalSensor_ was formed by using this example _HostSensor_, and also one of
the simplest existing sensor daemons, _HwmonTempSensor_[^7], as references to
build upon.

As written, after validating parameters during initialization, there is
essentially no work for _ExternalSensor_ to do. The main loop is mostly idle,
remaining blocked in the Boost ASIO[^8] library, handling D-Bus requests as they
come in. This utilizes the functionality in the underlying _Sensor_[^9] class,
which already contains the D-Bus hooks necessary to receive values from the
external source.

An example external source is the IPMI service[^10], receiving values from the
host via the IPMI "Set Sensor Reading" command[^11]. _ExternalSensor_ is
intended to be source-agnostic, so it does not matter if the source is IPMI or
Redfish[^12] or something else in the future, as long as the values are received
similarly over D-Bus.

### Timeout

The timeout feature is the primary feature which distinguishes _ExternalSensor_
from other sensor daemons. Once an external source starts providing updates, the
external source is expected to continue to provide timely updates. Each update
will be properly published onto D-Bus, in the usual way done by all sensor
daemons, as a floating-point value.
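
As a rough illustration of the timeout behavior covered in this section, the
sketch below keeps the latest reading and reports floating-point NaN once that
reading is older than the configured timeout. It is not the daemon's code: the
daemon is C++ and is event-driven, whereas this Python sketch checks staleness
lazily when the value is read. The names `StaleAwareValue`, `update`, `read`
and the injectable `clock` are hypothetical, and reporting NaN before the first
update arrives is an assumption made here for simplicity.

```python
import math
import time


class StaleAwareValue:
    """Holds the latest reading; reports NaN once it is older than `timeout`."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout    # seconds; None disables the staleness check
        self.clock = clock        # injectable time source (hypothetical helper)
        self.value = math.nan     # assumption: no reading yet is shown as NaN
        self.last_update = None   # time of the most recent update, if any

    def update(self, value):
        """Record a reading from the external source and restart the timeout."""
        self.value = float(value)
        self.last_update = self.clock()

    def read(self):
        """Return the current reading, or NaN if it has gone stale."""
        if self.last_update is None:
            return math.nan
        age = self.clock() - self.last_update
        if self.timeout is not None and age > self.timeout:
            return math.nan
        return self.value
```

With a timeout of 4 seconds, a reading stored by `update(42.5)` is returned by
`read()` for the next 4 seconds and then reads as NaN until the next update.
Passing `None` as the timeout disables the check, so the value never goes
stale.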

A timer tracks how long it has been since the last known good external update.
It is the same Boost ASIO[^13] timer mechanism that other sensor daemons use to
poll their hardware. When the timer expires, the sensor value will be deemed
stale, and will be replaced with floating-point quiet _NaN_[^14].

### NaN

The advantage of floating-point _NaN_ is that it is a drop-in replacement for
the valid floating-point value of the sensor. A subtle consequence of the earlier
OpenBMC sensor "Value" schema change, from integer to floating-point, is that
the field is essentially now nullable. Instead of having to choose an arbitrary
integer value to indicate "not valid", such as -1 or 9999, floating-point
explicitly has _NaN_ to indicate this. So, there is no possibility that this
will be mistaken for a valid sensor value, as _NaN_ is literally _not a number_,
and thus cannot be misparsed as a valid sensor reading. It thus saves having to
add a second field to reliably indicate validity, which would break the existing
schema[^15].

An alternative to using _NaN_ for staleness indication would have been to use a
timestamp, which would introduce the complication of having to parse and compare
timestamps within OpenBMC, and all the subtle difficulties thereof[^16].
What's more, adding a second field might require a second D-Bus message to
update, and D-Bus messages are computationally expensive[^17] and should be used
sparingly. Periodic things like sensors, which send out regular updates, could
easily lead to frequent D-Bus traffic and thus should be kept as minimal as
practical. And finally, changing the Value schema would cause a large blast
radius, both in design and in code, necessitating a large refactoring effort
well beyond the scope of what is needed for _ExternalSensor_.

### Configuration

Configuring a sensor for use with _ExternalSensor_ should be done in the usual
way[^18] that is done for use with other sensor daemons, namely, a JSON
dictionary that is an element of the "Exposes" array within a JSON configuration
file to be read by _entity-manager_. In that JSON dictionary, the valid names
are listed below. All of these are mandatory parameters, unless mentioned as
optional. For fields listed as "Numeric" below, this means that it can be either
integer or valid floating-point.

- "Name": String. The sensor name, which this sensor will be known as. A
  mandatory component of the _entity-manager_ configuration, and the resulting
  D-Bus object path.

- "Units": String. This parameter is unique to _ExternalSensor_. As
  _ExternalSensor_ is not tied to any particular physical hardware, it can
  measure any physical quantity supported by OpenBMC.
  This string will be translated to another string via a lookup table[^19], and
  forms another mandatory component of the D-Bus object path.

- "MinValue": Numeric. The minimum valid value for this sensor. Although not
  used by _ExternalSensor_ directly, it is a valuable hint for services such as
  IPMI, which need to know the minimum and maximum valid sensor values in order
  to scale their reporting range accurately. As _ExternalSensor_ is not tied to
  one particular physical quantity, there is no suitable default value for
  minimum and maximum. Thus, unlike other sensor daemons where this parameter is
  optional, in _ExternalSensor_ it is mandatory.

- "MaxValue": Numeric. The maximum valid value for this sensor. It is treated
  similarly to "MinValue".

- "Timeout": Numeric. This parameter is unique to _ExternalSensor_. It is the
  timeout value, in seconds. If this amount of time elapses with no new updates
  received over D-Bus from the external source, this sensor will be deemed
  stale. The value of this sensor will be replaced with floating-point _NaN_, as
  described above. This field is optional. If not given, the timeout feature
  will be disabled for this sensor (so it will never be deemed stale).

- "Type": String. Must be exactly "ExternalSensor".
  This string is used by _ExternalSensor_ to obtain configuration information
  from _entity-manager_ during initialization. This string is what
  differentiates JSON stanzas intended for _ExternalSensor_ versus JSON stanzas
  intended for other _dbus-sensors_ sensor daemons.

- "Thresholds": JSON dictionary. This field is optional. It is passed through to
  the main _Sensor_ class during initialization, similar to other sensor
  daemons. Other than that, it is not used by _ExternalSensor_.

- "PowerState": String. This field is optional. Similarly to "Thresholds", it is
  passed through to the main _Sensor_ class during initialization.

Here is an example. The sensor created by this stanza will form this object
path: /xyz/openbmc_project/sensors/temperature/HostDevTemp

```
        {
            "Name": "HostDevTemp",
            "Units": "DegreesC",
            "MinValue": -16.0,
            "MaxValue": 111.5,
            "Timeout": 4.0,
            "Type": "ExternalSensor"
        },
```

There can be multiple _ExternalSensor_ sensors in the configuration. There is no
set limit on the number of sensors, except what is supported by a service such
as IPMI.

## Implementation

As it stands now, _ExternalSensor_ is up and running[^20].
However, the timeout feature was originally implemented at the IPMI layer. Upon
further investigation, it was found that IPMI was the wrong place for this
feature, and that it should be moved within _ExternalSensor_ itself[^21]. It was
originally thought that the timeout feature would be a useful enhancement
available to all IPMI sensors. However, the expected usage of almost all
external sensor updates is a one-shot adjustment (for example, somebody wishes
to change a voltage regulator setting, or fan speed setting). In this case, the
timeout feature would not only be unnecessary, it would get in the way and
require additional coding[^22] to compensate for the unexpected _NaN_ value.
Only sensors intended for use with _ExternalSensor_ are expected to receive
continuous periodic updates from an external source, so it makes sense to move
this timeout feature into _ExternalSensor_. This change also has the advantage
of making _ExternalSensor_ not dependent on IPMI as the only source of external
updates.

A challenge of generalizing the timeout feature into _ExternalSensor_, however,
was that the existing _Sensor_ base class did not allow its existing D-Bus
setter hook to be customized. This feature was straightforward to add[^23]. One
limitation was that the existing _Sensor_ class, by design, dropped updates that
duplicated the existing sensor value.
For use with _ExternalSensor_, we want to recognize all updates received, even
duplicates, as they are important to pet the watchdog, to avoid inadvertently
triggering the timeout feature. However, it is still important to avoid
needlessly sending the D-Bus _PropertiesChanged_ event for duplicate readings.

The timeout value was originally a compiled-in constant. If _ExternalSensor_ is
to succeed as a general-purpose tool, this must be configurable. It was
straightforward to add another configurable parameter[^24] to accept this
timeout value, as described in the "Configuration" section above.

The hardest task of all, however, was getting it accepted upstream. If you are
reading this, then most likely, it was successful!

## Footnotes

[^1]: https://gerrit.openbmc.org/q/owner:krellan%2540google.com

[^2]: https://github.com/openbmc/dbus-sensors/blob/master/README.md

[^3]: https://www.kernel.org/doc/html/latest/hwmon/index.html

[^4]: https://github.com/openbmc/entity-manager/blob/master/README.md

[^5]: https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/36206

[^6]: https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/35476

[^7]: https://github.com/openbmc/dbus-sensors/blob/master/src/HwmonTempMain.cpp

[^8]: https://think-async.com/Asio/

[^9]:
    https://github.com/openbmc/dbus-sensors/blob/master/include/sensor.hpp

[^10]:
    https://github.com/openbmc/docs/blob/master/architecture/ipmi-architecture.md

[^11]:
    https://www.intel.com/content/www/us/en/servers/ipmi/ipmi-intelligent-platform-mgt-interface-spec-2nd-gen-v2-0-spec-update.html

[^12]: https://www.dmtf.org/standards/redfish

[^13]:
    https://www.boost.org/doc/libs/1_75_0/doc/html/boost_asio/overview/timers.html

[^14]: https://anniecherkaev.com/the-secret-life-of-nan

[^15]:
    https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Sensor/Value.interface.yaml

[^16]: https://cr.yp.to/proto/utctai.html

[^17]: https://github.com/openbmc/openbmc/issues/1892

[^18]:
    https://github.com/openbmc/entity-manager/blob/master/docs/my_first_sensors.md

[^19]: https://github.com/openbmc/dbus-sensors/blob/master/src/SensorPaths.cpp

[^20]: https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/36206

[^21]: https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/41398

[^22]: https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/39294

[^23]:
    https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/41394

[^24]: https://gerrit.openbmc.org/c/openbmc/entity-manager/+/41397

[^25]: https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/16177