# ExternalSensor in dbus-sensors

Author: Josh Lehan[^1]

Other contributors: Ed Tanous, Peter Lundgren, Alex Qiu

Created: March 19, 2021

## Introduction

In OpenBMC, the _dbus-sensors_[^2] package contains a suite of sensor daemons.
Each daemon monitors a particular type of sensor. This document provides
rationale and motivation for adding _ExternalSensor_, another sensor daemon, and
gives some example usages of it.

## Motivation

There are 10 existing sensor daemons in _dbus-sensors_. Why add another sensor
daemon?

- Most of the existing sensor daemons are tied to one particular physical
  quantity they are measuring, such as temperature, and are hardcoded as such.
  An externally-updated sensor has no such limitation, and should be flexible
  enough to measure any physical quantity currently supported by OpenBMC.

- Essentially all of the existing sensor daemons obtain the sensor values they
  publish to D-Bus by reading from local hardware (typically by reading from
  virtual files provided by the _hwmon_[^3] subsystem of the Linux kernel). None
  of the daemons are currently designed with the intention of accepting values
  pushed in from an external source. Although there is some debugging
  functionality to add this feature to other sensor daemons[^25], it is not the
  primary purpose for which they were designed.

- Even if the debugging functionality of an existing daemon were to be used, the
  daemon would still need a valid configuration tied to recognized hardware, as
  detected by _entity-manager_[^4], in order for the daemon to properly
  initialize itself and participate in the OpenBMC software stack.

- For the same reason it is desirable for existing sensor daemons to detect and
  properly indicate failures of their underlying hardware, it is desirable for
  _ExternalSensor_ to detect and properly indicate loss of timely sensor updates
  from their external source. This is a new feature, and does not cleanly fit
  into the architecture of any existing sensor daemon, thus a new daemon is the
  correct choice for this behavior.

For these reasons, _ExternalSensor_ has been added[^5], as the eleventh sensor
daemon in _dbus-sensors_.

## Design

After some discussion, a proof-of-concept _HostSensor_[^6] was published. This
was a stub, but it revealed the minimal implementation that would still be
capable of fully initializing and participating in the OpenBMC software stack.
_ExternalSensor_ was formed by using this example _HostSensor_, and also one of
the simplest existing sensor daemons, _HwmonTempSensor_[^7], as references to
build upon.

As written, after validating parameters during initialization, there is
essentially no work for _ExternalSensor_ to do. The main loop is mostly idle,
remaining blocked in the Boost ASIO[^8] library, handling D-Bus requests as they
come in. This utilizes the functionality in the underlying _Sensor_[^9] class,
which already contains the D-Bus hooks necessary to receive values from the
external source.

An example external source is the IPMI service[^10], receiving values from the
host via the IPMI "Set Sensor Reading" command[^11]. _ExternalSensor_ is
intended to be source-agnostic, so it does not matter if this is IPMI or
Redfish[^12] or something else in the future, as long as they are received
similarly over D-Bus.

### Timeout

The timeout feature is the primary feature which distinguishes _ExternalSensor_
from other sensor daemons. Once an external source starts providing updates, the
external source is expected to continue to provide timely updates. Each update
will be properly published onto D-Bus, in the usual way done by all sensor
daemons, as a floating-point value.

A timer is used, the same Boost ASIO[^13] timer mechanism used by other sensor
daemons to poll their hardware, but in this case, is used to manage how long it
has been since the last known good external update. When the timer expires, the
sensor value will be deemed stale, and will be replaced with floating-point
quiet _NaN_[^14].

### NaN

The advantage of floating-point _NaN_ is that it is a drop-in replacement for
the valid floating-point value of the sensor. A subtle difference of the earlier
OpenBMC sensor "Value" schema change, from integer to floating-point, is that
the field is essentially now nullable. Instead of having to arbitrarily choose
an arbitrary integer value to indicate "not valid", such as -1 or 9999 or
whatever, floating-point explicitly has _NaN_ to indicate this. So, there is no
possibility of confusion that this will be mistaken for a valid sensor value, as
_NaN_ is literally _not a number_, and thus can not be misparsed as a valid
sensor reading. It thus saves having to add a second field to reliably indicate
validity, which would break the existing schema[^15].

An alternative to using _NaN_ for staleness indication would have been to use a
timestamp, which would introduce the complication of having to parse and compare
timestamps within OpenBMC, and all the subtle difficulties thereof[^16]. What's
more, adding a second field might require a second D-Bus message to update, and
D-Bus messages are computationally expensive[^17] and should be used sparingly.
Periodic things like sensors, which send out regular updates, could easily lead
to frequent D-Bus traffic and thus should be kept as minimal as practical. And
finally, changing the Value schema would cause a large blast radius, both in
design and in code, necessitating a large refactoring effort well beyond the
scope of what is needed for _ExternalSensor_.

### Configuration

Configuring a sensor for use with _ExternalSensor_ should be done in the usual
way[^18] that is done for use with other sensor daemons, namely, a JSON
dictionary that is an element of the "Exposes" array within a JSON configuration
file to be read by _entity-manager_. In that JSON dictionary, the valid names
are listed below. All of these are mandatory parameters, unless mentioned as
optional. For fields listed as "Numeric" below, this means that it can be either
integer or valid floating-point.

- "Name": String. The sensor name, which this sensor will be known as. A
  mandatory component of the `entity-manager` configuration, and the resulting
  D-Bus object path.

- "Units": String. This parameter is unique to _ExternalSensor_. As
  _ExternalSensor_ is not tied to any particular physical hardware, it can
  measure any physical quantity supported by OpenBMC. This string will be
  translated to another string via a lookup table[^19], and forms another
  mandatory component of the D-Bus object path.

- "MinValue": Numeric. The minimum valid value for this sensor. Although not
  used by _ExternalSensor_ directly, it is a valuable hint for services such as
  IPMI, which need to know the minimum and maximum valid sensor values in order
  to scale their reporting range accurately. As _ExternalSensor_ is not tied to
  one particular physical quantity, there is no suitable default value for
  minimum and maximum. Thus, unlike other sensor daemons where this parameter is
  optional, in _ExternalSensor_ it is mandatory.

- "MaxValue": Numeric. The maximum valid value for this sensor. It is treated
  similarly to "MinValue".

- "Timeout": Numeric. This parameter is unique to _ExternalSensor_. It is the
  timeout value, in seconds. If this amount of time elapses with no new updates
  received over D-Bus from the external source, this sensor will be deemed
  stale. The value of this sensor will be replaced with floating-point _NaN_, as
  described above. This field is optional. If not given, the timeout feature
  will be disabled for this sensor (so it will never be deemed stale).

- "Type": String. Must be exactly "ExternalSensor". This string is used by
  _ExternalSensor_ to obtain configuration information from _entity-manager_
  during initialization. This string is what differentiates JSON stanzas
  intended for _ExternalSensor_ versus JSON stanzas intended for other
  _dbus-sensors_ sensor daemons.

- "Thresholds": JSON dictionary. This field is optional. It is passed through to
  the main _Sensor_ class during initialization, similar to other sensor
  daemons. Other than that, it is not used by _ExternalSensor_.

- "PowerState": String. This field is optional. Similarly to "Thresholds", it is
  passed through to the main _Sensor_ class during initialization.

Here is an example. The sensor created by this stanza will form this object
path: /xyz/openbmc_project/sensors/temperature/HostDevTemp

```
        {
            "Name": "HostDevTemp",
            "Units": "DegreesC",
            "MinValue": -16.0,
            "MaxValue": 111.5,
            "Timeout": 4.0,
            "Type": "ExternalSensor"
        },
```

There can be multiple _ExternalSensor_ sensors in the configuration. There is no
set limit on the number of sensors, except what is supported by a service such
as IPMI.

## Implementation

As it stands now, _ExternalSensor_ is up and running[^20]. However, the timeout
feature was originally implemented at the IPMI layer. Upon further
investigation, it was found that IPMI was the wrong place for this feature, and
that it should be moved within _ExternalSensor_ itself[^21]. It was originally
thought that the timeout feature would be a useful enhancement available to all
IPMI sensors, however, expected usage of almost all external sensor updates is a
one-shot adjustment (for example, somebody wishes to change a voltage regulator
setting, or fan speed setting). In this case, the timeout feature would not only
not be necessary, it would get in the way and require additional coding[^22] to
compensate for the unexpected _NaN_ value. Only sensors intended for use with
_ExternalSensor_ are expected to receive continuous periodic updates from an
external source, so it makes sense to move this timeout feature into
_ExternalSensor_. This change also has the advantage of making _ExternalSensor_
not dependent on IPMI as the only source of external updates.

A challenge of generalizing the timeout feature into _ExternalSensor_, however,
was that the existing _Sensor_ base class did not currently allow its existing
D-Bus setter hook to be customized. This feature was straightforward to
add[^23]. One limitation was that the existing _Sensor_ class, by design,
dropped updates that duplicated the existing sensor value. For use with
_ExternalSensor_, we want to recognize all updates received, even duplicates, as
they are important to pet the watchdog, to avoid inadvertently triggering the
timeout feature. However, it is still important to avoid needlessly sending the
D-Bus _PropertiesChanged_ event for duplicate readings.

The timeout value was originally a compiled-in constant. If _ExternalSensor_ is
to succeed as a general-purpose tool, this must be configurable. It was
straightforward to add another configurable parameter[^24] to accept this
timeout value, as shown in "Parameters" above.

The hardest task of all, however, was getting it accepted upstream. If you are
reading this, then most likely, it was successful!

## Footnotes

[^1]: https://gerrit.openbmc.org/q/owner:krellan%2540google.com

[^2]: https://github.com/openbmc/dbus-sensors/blob/master/README.md

[^3]: https://www.kernel.org/doc/html/latest/hwmon/index.html

[^4]: https://github.com/openbmc/entity-manager/blob/master/README.md

[^5]: https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/36206

[^6]: https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/35476

[^7]: https://github.com/openbmc/dbus-sensors/blob/master/src/HwmonTempMain.cpp

[^8]: https://think-async.com/Asio/

[^9]: https://github.com/openbmc/dbus-sensors/blob/master/include/sensor.hpp

[^10]:
    https://github.com/openbmc/docs/blob/master/architecture/ipmi-architecture.md

[^11]:
    https://www.intel.com/content/www/us/en/servers/ipmi/ipmi-intelligent-platform-mgt-interface-spec-2nd-gen-v2-0-spec-update.html

[^12]: https://www.dmtf.org/standards/redfish

[^13]:
    https://www.boost.org/doc/libs/1_75_0/doc/html/boost_asio/overview/timers.html

[^14]: https://anniecherkaev.com/the-secret-life-of-nan

[^15]:
    https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Sensor/Value.interface.yaml

[^16]: https://cr.yp.to/proto/utctai.html

[^17]: https://github.com/openbmc/openbmc/issues/1892

[^18]:
    https://github.com/openbmc/entity-manager/blob/master/docs/my_first_sensors.md

[^19]: https://github.com/openbmc/dbus-sensors/blob/master/src/SensorPaths.cpp

[^20]: https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/36206

[^21]: https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/41398

[^22]: https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/39294

[^23]: https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/41394

[^24]: https://gerrit.openbmc.org/c/openbmc/entity-manager/+/41397

[^25]: https://gerrit.openbmc.org/c/openbmc/dbus-sensors/+/16177