# BMC Health Monitor

Author: Vijay Khemka <vijaykhemka@fb.com>, Sui Chen <suichen@google.com>, Jagpal
Singh Gill <paligill@gmail.com>

Created: 2020-05-04

## Problem Description

The problem is to monitor the health of a system with a BMC so that we have
some means to make sure the BMC is working correctly. Users can get the
configured metrics instantly, on request. The set of monitored metrics may
include CPU and memory utilization, uptime, free disk space, I2C bus
statistics, and so on. Actions can be taken based on the monitoring data to
correct the BMC’s state.

For this purpose, there may exist a metric producer (the subject of this
document) and a metric consumer (a program that makes use of health monitoring
data, which may run on the BMC or on the host). They perform the following
tasks:

1. Configuration, where the user specifies what to collect and how, thresholds,
   etc.
2. Metric collection, similar to what the read routine in phosphor-hwmon-readd
   does.
3. Metric staging. Once metrics are collected, they are ready to be read at any
   time in accessible forms such as DBus objects or raw files for use by
   consumer programs. Because of this staging step, the consumer does not need
   to poll and wait.
4. Data transfer, where the consumer program obtains the metrics from the BMC by
   in-band or out-of-band methods.
5. Action, where the consumer program may take certain actions based on the
   metrics collected.

Among those tasks, 1), 2), and 3) are the producer’s responsibility. 4) is
accomplished by both the producer and the consumer. 5) is up to the consumer.

We realize there is some overlap between sensors and health monitoring in terms
of design rationale and existing infrastructure. However, there are also several
differences between sensors and metrics:

1. Sensor data originate from hardware, while most metrics may be obtained
   through software. For this reason, there may be more commonalities between
   metrics on all kinds of BMCs than between sensors on BMCs, and we might not
   need the hardware discovery process or build-time, hardware-specific
   configuration for most health metrics.
2. Most sensors are instantaneous readings, while metrics might accumulate over
   time, such as “uptime”. For those metrics, we might want to do calculations
   that do not apply to sensor readings.
3. Metrics can represent device attributes that don't change, for example,
   total system memory, which is constant. In contrast, the primary purpose of
   sensors is to sense changes in attributes and represent that variability.
4. Metrics are expressed in native units, such as bytes for memory. The sensor
   infrastructure doesn't adhere to this, and the community has rejected the
   proposal to add bytes as a sensor unit.

Based on the above, it does not seem reasonable to use sensors to represent
metrics data.

## Background and References

References: dbus-monitor

## Requirements

The metric producer should provide:

- A daemon to periodically collect various health metrics and expose them on
  DBus.
- A DBus interface to allow other services, like Redfish and IPMI, to access its
  data.
- Capability to configure health monitoring for a wide variety of metrics, such
  as Memory Utilization, CPU Utilization, Reboot Statistics, etc.
- Capability to provide granular details for various metric types, for example:
  - Memory Utilization - Free Memory, Shared Memory, Buffered and Cached Memory,
    etc.
  - CPU Utilization - Userspace CPU Utilization, Kernelspace CPU Utilization,
    etc.
  - Reboot Statistics - Normal reboot count, Reboot count with failures, etc.
- Capability to take action as configured when a value crosses its threshold.
- Optionally, maintain a certain amount of historical data.
- Optionally, log critical / warning messages.

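The threshold requirement can be illustrated with a minimal sketch. It is
written in Python for brevity and is not the daemon's implementation. It assumes
that a value "crosses" a threshold when the average of the most recent samples
exceeds the limit, in the spirit of the `Window_size` setting in the proposed
configuration. The class name `WindowedThreshold` and the choice to average over
a partially filled window are assumptions made for illustration.

```python
from collections import deque


class WindowedThreshold:
    """Average the last `window_size` samples and compare against a limit."""

    def __init__(self, window_size, limit):
        # A bounded deque drops the oldest sample automatically.
        self.samples = deque(maxlen=window_size)
        self.limit = limit

    def add(self, value):
        """Record one sample; return True if the windowed average exceeds the limit."""
        self.samples.append(value)
        average = sum(self.samples) / len(self.samples)
        return average > self.limit


# A single spike does not trip the threshold; sustained load eventually does.
cpu = WindowedThreshold(window_size=4, limit=90.0)
spike = [cpu.add(v) for v in (50.0, 100.0, 50.0, 50.0)]
sustained = [cpu.add(v) for v in (100.0, 100.0, 100.0, 100.0)]
```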
The metric consumer may be implemented in various ways. However it is
implemented, it should be able to obtain the health metrics from the producer
through a set of interfaces.

The metric consumer is not in the scope of this document.

## Proposed Design

The metric producer is a daemon running on the BMC that performs the required
tasks and meets the requirements above. As described above, it is responsible
for:

1. Configuration
2. Metric collection
3. Metric staging and dispersal

For 1) Configuration, the daemon will have a default configuration built into
the code. A platform may supply a configuration file if it wants to override
specific default attributes. The format of the JSON configuration file is as
follows:

```json
{
  "kernel": {
    "Frequency": 1,
    "Window_size": 120,
    "Type": "CPU",
    "Threshold": {
      "Critical": {
        "Value": 90.0,
        "Log": true,
        "Target": "reboot.target"
      },
      "Warning": {
        "Value": 80.0,
        "Log": false,
        "Target": "systemd unit file"
      }
    }
  },
  "available": {
    "Frequency": 1,
    "Window_size": 120,
    "Type": "Memory",
    "Threshold": {
      "Critical": {
        "Value": 90.0,
        "Log": true,
        "Target": "reboot.target"
      }
    }
  }
}
```

- Frequency: The interval, in seconds, at which the data is collected.
- Window_size: The number of samples that are averaged to represent system
  usage, so that a momentary spike is not taken as the usage value.
- Log: A boolean value that enables logging an alert. This field is optional;
  its default value is 'true' for critical thresholds and 'false' for warning
  thresholds.
- Target: The systemd target unit that will be called once the value crosses
  its threshold. This field is optional.
- Type: The type of the configuration entry. Possible values are Memory, CPU,
  Reboot, and Storage.

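The defaults for the optional fields can be sketched as follows. This is an
illustrative Python sketch and not the daemon's parser. The function name
`load_config` and the use of `None` to mean "no target configured" are
assumptions. The `Log` defaults follow the field description given in this
section.

```python
import json

# Defaults for the optional "Log" field, per threshold level.
LOG_DEFAULTS = {"Critical": True, "Warning": False}


def load_config(text):
    """Parse the JSON text and fill in the optional threshold fields."""
    config = json.loads(text)
    for entry in config.values():
        for level, threshold in entry.get("Threshold", {}).items():
            threshold.setdefault("Log", LOG_DEFAULTS.get(level, False))
            # Assumption: None means no systemd unit is started.
            threshold.setdefault("Target", None)
    return config


example = """
{
  "kernel": {
    "Frequency": 1,
    "Window_size": 120,
    "Type": "CPU",
    "Threshold": {
      "Critical": {"Value": 90.0, "Target": "reboot.target"},
      "Warning": {"Value": 80.0}
    }
  }
}
"""
config = load_config(example)
thresholds = config["kernel"]["Threshold"]
```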
For 2) Metric collection, this will be done by running certain functions within
the daemon, as opposed to launching external programs and shell scripts, for
performance and security reasons.

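As an illustration of in-process collection, the sketch below reads and parses
`/proc/meminfo` directly instead of spawning an external tool. It is written in
Python for brevity; the function names `parse_meminfo` and `read_meminfo` are
hypothetical. Values are converted to bytes, assuming the `kB` suffix in this
file denotes units of 1024 bytes.

```python
def parse_meminfo(text):
    """Return a dict of field name -> size in bytes from /proc/meminfo text."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if not parts:
            continue
        amount = int(parts[0])
        # Sizes carry a "kB" suffix; plain counters (no suffix) are kept as-is.
        if len(parts) > 1 and parts[1] == "kB":
            amount *= 1024
        values[name.strip()] = amount
    return values


def read_meminfo(path="/proc/meminfo"):
    """Read the file directly, without launching an external program."""
    with open(path) as f:
        return parse_meminfo(f.read())


# Parsing a small sample keeps the example independent of the host system.
sample = "MemTotal:  1024 kB\nMemFree:  256 kB\nMemAvailable:  512 kB\n"
info = parse_meminfo(sample)
```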
For 3) Metric staging and dispersal, the daemon creates a D-Bus service named
"xyz.openbmc_project.HealthManager". The design proposes new
[Metrics DBus interfaces](https://gerrit.openbmc.org/c/openbmc/phosphor-dbus-interfaces/+/64914).

| Interface Name                       | Purpose                                                                        | Required/Optional |
| :----------------------------------- | :----------------------------------------------------------------------------- | :---------------- |
| xyz.openbmc_project.Metric.Value     | Interface to represent value for Metrics.                                      | Required          |
| xyz.openbmc_project.Metric.Reset     | Interface to reset persistent Metrics counters.                                | Optional          |
| xyz.openbmc_project.Common.Threshold | Interface to represent Metric thresholds and signals for threshold violations. | Optional          |
| xyz.openbmc_project.Time.EpochTime   | Interface to indicate when the metric was collected.                           | Optional          |

Each metric will be exposed on a specific object path, and the above interfaces
will be implemented at these paths.

```
/xyz/openbmc_project
    |- /xyz/openbmc_project/metric/bmc/memory/total
    |- /xyz/openbmc_project/metric/bmc/memory/free
    |- /xyz/openbmc_project/metric/bmc/memory/available
    |- /xyz/openbmc_project/metric/bmc/memory/shared
    |- /xyz/openbmc_project/metric/bmc/memory/buffered_and_cached
    |- /xyz/openbmc_project/metric/bmc/cpu/user
    |- /xyz/openbmc_project/metric/bmc/cpu/kernel
    |- /xyz/openbmc_project/metric/bmc/reboot/count
    |- /xyz/openbmc_project/metric/bmc/reboot/count_with_failure
```

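One possible way to relate the memory object paths to their data source is
sketched below in Python. The mapping from metric names to `/proc/meminfo`
fields (for example, `shared` from `Shmem` and `buffered_and_cached` from
`Buffers` plus `Cached`) is an assumption; the design does not spell it out. The
names `MEMORY_SOURCES` and `memory_metrics` are hypothetical.

```python
METRIC_ROOT = "/xyz/openbmc_project/metric/bmc"

# Assumed mapping from memory metric names to the /proc/meminfo fields that are
# summed to produce them.
MEMORY_SOURCES = {
    "total": ("MemTotal",),
    "free": ("MemFree",),
    "available": ("MemAvailable",),
    "shared": ("Shmem",),
    "buffered_and_cached": ("Buffers", "Cached"),
}


def memory_metrics(meminfo):
    """Map meminfo values (field name -> bytes) to object path -> bytes."""
    return {
        f"{METRIC_ROOT}/memory/{name}": sum(meminfo[field] for field in fields)
        for name, fields in MEMORY_SOURCES.items()
    }


sample = {"MemTotal": 1000, "MemFree": 400, "MemAvailable": 600,
          "Shmem": 50, "Buffers": 30, "Cached": 170}
metrics = memory_metrics(sample)
```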
Servers for Metrics Data

| Metric Type        | Interface Server        | Info Source                                            |
| :----------------- | :---------------------- | :----------------------------------------------------- |
| Memory Utilization | phosphor-health-manager | /proc/meminfo                                          |
| CPU Utilization    | phosphor-health-manager | /proc/stat                                             |
| Reboot Statistics  | phosphor-state-manager  | Persistent counters incremented based on reboot status |

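For the CPU row, utilization is typically derived from the difference between
two `/proc/stat` samples, since the counters in that file are cumulative. The
Python sketch below illustrates the idea. Treating the `user` counter as
userspace utilization and the `system` counter as kernelspace utilization is a
simplifying assumption, and the function names are hypothetical.

```python
def parse_cpu_line(stat_text):
    """Return the aggregate counters from the "cpu" line of /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # user, nice, system, idle, iowait, irq, softirq, steal
            return [int(v) for v in fields[1:9]]
    raise ValueError("no aggregate cpu line found")


def cpu_utilization(before, after):
    """Return (user %, kernel %) over the interval between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total == 0:
        return 0.0, 0.0
    user = 100.0 * deltas[0] / total
    kernel = 100.0 * deltas[2] / total
    return user, kernel


# Two synthetic samples: 30 user ticks, 10 system ticks, 60 idle ticks elapsed.
first = parse_cpu_line("cpu  100 0 50 850 0 0 0 0 0 0\ncpu0 100 0 50 850 0 0 0 0 0 0\n")
second = parse_cpu_line("cpu  130 0 60 910 0 0 0 0 0 0\ncpu0 130 0 60 910 0 0 0 0 0 0\n")
user_pct, kernel_pct = cpu_utilization(first, second)
```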
Multiple devices of the same type:

If there are multiple devices of the same type, the D-Bus path can be extended
to add context about **"which device"**. For example:

```
/xyz/openbmc_project/metric/device-0/memory/total
/xyz/openbmc_project/metric/device-1/memory/total
...
```

These paths can be hosted by different daemons; for example, pldmd can host DBus
paths for BICs if the master BMC uses PLDM to communicate with the BICs. The
Value interface for each metric would need to be associated with the
appropriate system inventory item.

## Alternatives Considered

We have tried doing health monitoring completely within the IPMI Blob framework.
In comparison, making metric collection a separate daemon is better for
supporting more interfaces.

We have also tried doing the metric collection task by running an external
binary as well as a shell script. It turns out that running a shell script is
too slow, while running an external program raises security concerns (in that
the third-party program would need to be verified to be safe).

Collectd: Collectd provides multiple plugins that gather a wide variety of
metrics from various sources, and it provides mechanisms to store them in
different ways. To expose these metrics on DBus, a Collectd C plugin can be
written.

Pros:

- Off-the-shelf tool with support for a lot of metrics.

Cons:

- Because Collectd supports a wide variety of systems (Linux, Solaris, OpenBSD,
  MacOSX, AIX, etc.) and applications, the amount of code in each Collectd
  plugin is significant. Given the amount of functionality needed for OpenBMC,
  Collectd seems heavyweight. The majority of the phosphor-health-monitor code
  will be for exposing the metrics on DBus, which would also be needed for a
  Collectd plugin. Hence, reading directly from /proc/<fileX> seems more
  lightweight, as code already exists for it.
- Collectd has minimal support for threshold monitoring and doesn't allow
  starting systemd services on threshold violations.

## Future Enhancements

Extend the Metrics DBus interface for:

- Storage
- Inodes
- Port/Network Statistics
- BMC Daemon Statistics

## Impacts

Most of what the Health Monitoring Daemon does is collect metrics and update
DBus objects. The impact of the daemon itself should be small.

The proposed design changes the DBus interface from Sensors to Metrics, so the
following daemons would need to be refactored or updated to account for the
interface change:

- [BMCWeb](https://github.com/openbmc/bmcweb/blob/master/redfish-core/lib/manager_diagnostic_data.hpp)
- [phosphor-host-ipmid](https://grok.openbmc.org/xref/openbmc/openbmc/meta-quanta/meta-s6q/recipes-phosphor/configuration/s6q-yaml-config/ipmi-sensors.yaml?r=e4f3792f#82)

## Organizational

### Does this design require a new repository?

No, changes will go into phosphor-health-monitor.

### Which repositories are expected to be modified to execute this design?

- phosphor-health-monitor
- phosphor-state-manager
- BMCWeb
- phosphor-host-ipmid

## Testing

### Unit Testing

To verify that the daemon is functioning correctly, monitor the DBus traffic
generated by the daemon and the metric values on the daemon’s DBus objects.
Automated unit testing will be covered via GTest.

### Integration Testing

Manual end-to-end testing can be performed via a Redfish GET for
ManagerDiagnosticData. Automated end-to-end testing will be covered using
openbmc-test-automation. To verify performance, we can stress-test the daemon’s
DBus interfaces to make sure they do not cause high overhead.
