Host Management with OpenBMC
============================

This document describes the host-management interfaces of the OpenBMC object
structure, accessible over REST.

Inventory
---------

The system inventory structure is under the `/xyz/openbmc_project/inventory`
hierarchy.

In OpenBMC the inventory is represented as a path hierarchy that mirrors the
physical system topology. Items in the inventory are referred to as inventory
items and are not necessarily FRUs (field-replaceable units). If the system
contains one chassis, a motherboard, and a CPU on the motherboard, then the
path to the CPU inventory item would be:

    inventory/system/chassis0/motherboard0/cpu0

The properties associated with an inventory item are specific to that item.
Some common properties are:

 * `Version`: A code version associated with this item.
 * `Present`: Indicates whether this item is present in the system (True/False).
 * `Functional`: Indicates whether this item is functioning in the system (True/False).

The usual `list` and `enumerate` REST queries allow the system inventory
structure to be accessed. For example, to enumerate all inventory items and
their properties:

    curl -b cjar -k https://${bmc}/xyz/openbmc_project/inventory/enumerate

To list the properties of one item:

    curl -b cjar -k https://${bmc}/xyz/openbmc_project/inventory/system/chassis/motherboard

Sensors
-------

The system sensor structure is under the `/xyz/openbmc_project/sensors`
hierarchy.

This interface allows monitoring of system attributes like temperature or
altitude. Sensors are represented similarly to the inventory, by object paths
under the top-level `sensors` object name. The path categorizes the sensor and
shows what the sensor represents, but does not necessarily represent the
physical topology of the system.

For example, all temperature sensors are under `sensors/temperature`. CPU
temperature sensors would be `sensors/temperature/cpu[n]`.

These are some common properties:

 * `Value`: Current value of the sensor
 * `Unit`: Unit of the value and of the "Critical" and "Warning" values
 * `Scale`: The scale (power-of-ten exponent) of the value and of the
   "Critical" and "Warning" values
 * `CriticalHigh` & `CriticalLow`: Sensor device upper/lower critical threshold
   bound
 * `CriticalAlarmHigh` & `CriticalAlarmLow`: True if the sensor has exceeded the
   critical threshold bound
 * `WarningHigh` & `WarningLow`: Sensor device upper/lower warning threshold
   bound
 * `WarningAlarmHigh` & `WarningAlarmLow`: True if the sensor has exceeded the
   warning threshold bound

A temperature sensor might look like:

    $ curl -b cjar -k https://${bmc}/xyz/openbmc_project/sensors/temperature/pcie
    {
      "data": {
        "CriticalAlarmHigh": 0,
        "CriticalAlarmLow": 0,
        "CriticalHigh": 70000,
        "CriticalLow": 0,
        "Scale": -3,
        "Unit": "xyz.openbmc_project.Sensor.Value.Unit.DegreesC",
        "Value": 28187,
        "WarningAlarmHigh": 0,
        "WarningAlarmLow": 0,
        "WarningHigh": 60000,
        "WarningLow": 0
      },
      "message": "200 OK",
      "status": "ok"
    }

Note that the value of this sensor is 28.187 °C (28187 * 10^-3).

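The scaling arithmetic can be sketched as a small shell helper. This is only an illustration: `sensor_scaled` is a hypothetical name, not part of OpenBMC, and the sketch assumes an `awk` that supports the `^` operator. The same conversion applies to the threshold properties, since they share the sensor's `Scale`.

```shell
# Convert a raw sensor reading to its real value: Value * 10^Scale.
# sensor_scaled is a hypothetical helper name, not an OpenBMC command.
sensor_scaled() {
    # $1 = raw Value (or a threshold), $2 = Scale exponent
    awk -v value="$1" -v scale="$2" 'BEGIN { printf "%g\n", value * 10 ^ scale }'
}

sensor_scaled 28187 -3   # Value from the sample response: prints 28.187
sensor_scaled 70000 -3   # CriticalHigh from the sample response: prints 70
```
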
Unlike IPMI, there are no "functional" sensors in OpenBMC; functional states are
represented in the inventory.

To enumerate all sensors in the system:

    curl -b cjar -k https://${bmc}/xyz/openbmc_project/sensors/enumerate

To list the properties of one sensor:

    curl -b cjar -k https://${bmc}/xyz/openbmc_project/sensors/temperature/ambient

Event Logs
----------

The event log structure is under the `/xyz/openbmc_project/logging/entry`
hierarchy. Each event is a separate object under this structure, referenced by
number.

BMC and host firmware on POWER-based servers can report event logs to the BMC.
Typically, these event logs are reported in cases where host firmware cannot
start the OS, or cannot reliably log to the OS.

The properties associated with an event log are as follows:

 * `Message`: The type of event log (e.g. "xyz.openbmc_project.Inventory.Error.NotPresent").
 * `Resolved`: Indicates whether the event has been resolved.
 * `Severity`: The level of problem ("Info", "Error", etc.).
 * `Timestamp`: The date of the event log in epoch time (milliseconds).
 * `associations`: A URI to the failing inventory part.

To list all reported event logs:

    $ curl -b cjar -k https://${bmc}/xyz/openbmc_project/logging/entry/
    {
      "data": [
        "/xyz/openbmc_project/logging/entry/3",
        "/xyz/openbmc_project/logging/entry/2",
        "/xyz/openbmc_project/logging/entry/1",
        "/xyz/openbmc_project/logging/entry/7",
        "/xyz/openbmc_project/logging/entry/6",
        "/xyz/openbmc_project/logging/entry/5",
        "/xyz/openbmc_project/logging/entry/4"
      ],
      "message": "200 OK",
      "status": "ok"
    }

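As the sample response shows, entries are not necessarily returned in numeric order. A minimal sketch for sorting them by entry number follows. `sorted_log_entries` is a hypothetical helper, and it assumes a `grep` that supports `-o` (GNU, BSD and BusyBox versions do).

```shell
# Print the event log entry paths found in a REST list response, sorted
# numerically by entry number. Reads the JSON response on stdin.
# sorted_log_entries is a hypothetical helper name.
sorted_log_entries() {
    grep -o '/xyz/openbmc_project/logging/entry/[0-9][0-9]*' |
        sort -t / -k 6,6n   # field 6 is the entry number
}

# Abbreviated copy of the sample response; prints entries 1, 3, 4, 7.
sorted_log_entries <<'EOF'
{
  "data": [
    "/xyz/openbmc_project/logging/entry/3",
    "/xyz/openbmc_project/logging/entry/1",
    "/xyz/openbmc_project/logging/entry/7",
    "/xyz/openbmc_project/logging/entry/4"
  ],
  "message": "200 OK",
  "status": "ok"
}
EOF
```

Against a live BMC, the output of the `curl` listing command can be piped straight into `sorted_log_entries`.
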
To read a specific event log:

    $ curl -b cjar -k https://${bmc}/xyz/openbmc_project/logging/entry/1
    {
      "data": {
        "AdditionalData": [
          "CALLOUT_INVENTORY_PATH=/xyz/openbmc_project/inventory/system/chassis/powersupply0",
          "_PID=1136"
        ],
        "Id": 1,
        "Message": "xyz.openbmc_project.Inventory.Error.NotPresent",
        "Resolved": 0,
        "Severity": "xyz.openbmc_project.Logging.Entry.Level.Error",
        "Timestamp": 1512154612660,
        "associations": [
          [
            "callout",
            "fault",
            "/xyz/openbmc_project/inventory/system/chassis/powersupply0"
          ]
        ]
      },
      "message": "200 OK",
      "status": "ok"
    }

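The `Timestamp` in this sample has 13 digits, which suggests milliseconds since the epoch rather than seconds. Under that assumption, a sketch for splitting it into whole seconds and leftover milliseconds looks like this (`log_timestamp_split` is a hypothetical helper, and the shell is assumed to have 64-bit arithmetic):

```shell
# Split an event log Timestamp into whole seconds and leftover milliseconds.
# Assumes the value is milliseconds since the epoch.
# log_timestamp_split is a hypothetical helper name.
log_timestamp_split() {
    echo "$(( $1 / 1000 )) $(( $1 % 1000 ))"
}

log_timestamp_split 1512154612660   # prints "1512154612 660"
```

With GNU `date`, `date -u -d @1512154612` should then render the seconds part as 18:56:52 UTC on 1 December 2017.
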
To delete an event log (log 1 in this example), call the `Delete` method on the event:

    curl -b cjar -k -H "Content-Type: application/json" -X POST \
        -d '{"data" : []}' \
        https://${bmc}/xyz/openbmc_project/logging/entry/1/action/Delete

To clear all event logs, call the top-level `DeleteAll` method:

    curl -b cjar -k -H "Content-Type: application/json" -X POST \
        -d '{"data" : []}' \
        https://${bmc}/xyz/openbmc_project/logging/action/DeleteAll

Host Boot Options
-----------------

With OpenBMC, the host boot options are stored as D-Bus properties under the
`control/host0/boot` path. Properties include
[`BootMode`](https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/xyz/openbmc_project/Control/Boot/Mode.interface.yaml)
and
[`BootSource`](https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/xyz/openbmc_project/Control/Boot/Source.interface.yaml).

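As a rough sketch, a `BootSource` update could be built as follows. Several things here are assumptions rather than facts taken from this document: the helper names are hypothetical, the enumeration names (`Disk`, `Network`, `Default`) should be checked against the linked `Source.interface.yaml`, the full path is assumed to be `/xyz/openbmc_project/control/host0/boot`, and the request is assumed to follow the same `/attr/<property>` PUT pattern used for host state properties.

```shell
# Build the JSON body for a BootSource update.
# boot_source_payload is a hypothetical helper; the enumeration names are
# assumed from Source.interface.yaml and should be verified there.
boot_source_payload() {
    case "$1" in
        disk)    source_name=Disk ;;
        network) source_name=Network ;;
        default) source_name=Default ;;
        *) echo "unknown boot source: $1" >&2; return 1 ;;
    esac
    printf '{"data": "xyz.openbmc_project.Control.Boot.Source.Sources.%s"}\n' "$source_name"
}

# Defined but not invoked here: send the update to the BMC.
# The URL is an assumption based on the control/host0/boot path.
set_boot_source() {
    body=$(boot_source_payload "$1") || return 1
    curl -b cjar -k -H "Content-Type: application/json" -X PUT \
        -d "$body" \
        "https://${bmc}/xyz/openbmc_project/control/host0/boot/attr/BootSource"
}

boot_source_payload network
```

Keeping the payload builder separate from the `curl` call makes it possible to inspect the request body before anything is sent to the BMC.
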
Host State Control
------------------

The host can be controlled through the `host` object. The object implements a
number of actions including power on and power off. These correspond to the IPMI
`power on` and `power off` commands.

Assuming you have logged in, the following will power on the host:

```
curl -c cjar -b cjar -k -H "Content-Type: application/json" -X PUT \
  -d '{"data": "xyz.openbmc_project.State.Host.Transition.On"}' \
  https://${bmc}/xyz/openbmc_project/state/host0/attr/RequestedHostTransition
```

To power off the host:

```
curl -c cjar -b cjar -k -H "Content-Type: application/json" -X PUT \
  -d '{"data": "xyz.openbmc_project.State.Host.Transition.Off"}' \
  https://${bmc}/xyz/openbmc_project/state/host0/attr/RequestedHostTransition
```

To issue a hard power off (accomplished by powering off the chassis):

```
curl -c cjar -b cjar -k -H "Content-Type: application/json" -X PUT \
  -d '{"data": "xyz.openbmc_project.State.Chassis.Transition.Off"}' \
  https://${bmc}/xyz/openbmc_project/state/chassis0/attr/RequestedPowerTransition
```

To reboot the host:

```
curl -c cjar -b cjar -k -H "Content-Type: application/json" -X PUT \
  -d '{"data": "xyz.openbmc_project.State.Host.Transition.Reboot"}' \
  https://${bmc}/xyz/openbmc_project/state/host0/attr/RequestedHostTransition
```

More information about Host State Management can be found here:
https://github.com/openbmc/phosphor-dbus-interfaces/tree/master/xyz/openbmc_project/State

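The four requests in this section differ only in the state object, the property, and the transition value. One way to capture that in a wrapper is sketched below. The helper names and the short action names (`on`, `off`, `reboot`, `hard-off`) are hypothetical, while the object paths and transition strings are the ones used in the requests above.

```shell
# Map a short action name to "<object> <property> <transition>".
# state_request is a hypothetical helper name.
state_request() {
    case "$1" in
        on)       echo "host0 RequestedHostTransition Host.Transition.On" ;;
        off)      echo "host0 RequestedHostTransition Host.Transition.Off" ;;
        reboot)   echo "host0 RequestedHostTransition Host.Transition.Reboot" ;;
        hard-off) echo "chassis0 RequestedPowerTransition Chassis.Transition.Off" ;;
        *) echo "unknown action: $1" >&2; return 1 ;;
    esac
}

# Defined but not invoked here: issue the PUT request for an action.
request_state() {
    req=$(state_request "$1") || return 1
    set -- $req   # split into object, property, transition
    curl -c cjar -b cjar -k -H "Content-Type: application/json" -X PUT \
        -d "{\"data\": \"xyz.openbmc_project.State.$3\"}" \
        "https://${bmc}/xyz/openbmc_project/state/$1/attr/$2"
}

state_request hard-off   # prints the chassis0 mapping
```

After logging in, `request_state reboot` would then send the same request as the reboot example.
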
Host Clear GARD
---------------

On OpenPOWER systems, the host maintains a record of bad or non-working
components on the GARD partition. This record is referenced by the host on
subsequent boots to determine which parts should be ignored.

The BMC implements a function that simply clears this partition. This function
can be called as follows:

  * Method 1: From the BMC command line:

      ```
      busctl call org.open_power.Software.Host.Updater \
        /org/open_power/control/gard \
        xyz.openbmc_project.Common.FactoryReset Reset
      ```

  * Method 2: Using the REST API:

      ```
      curl -b cjar -k -H 'Content-Type: application/json' -X POST \
        -d '{"data":[]}' \
        https://${bmc}/org/open_power/control/gard/action/Reset
      ```

Implementation: https://github.com/openbmc/openpower-pnor-code-mgmt

Host Watchdog
-------------

The host watchdog service is responsible for ensuring the host starts and boots
within a reasonable time. On host start, the watchdog is started and it is
expected that the host will ping the watchdog via the inband interface
periodically as it boots. If the host fails to ping the watchdog within the
timeout, the host watchdog starts a systemd target that takes the host to the
quiesce target. System settings then determine the recovery behavior from that
state, for example, attempting to reboot the system.

The host watchdog utilizes the generic [phosphor-watchdog][1] repository. The
host watchdog service provides two files as configuration options to
phosphor-watchdog:

    /lib/systemd/system/phosphor-watchdog@poweron.service.d/poweron.conf
    /etc/default/obmc/watchdog/poweron

`poweron.conf` contains the "Conflicts" relationships to ensure the watchdog
service is stopped at the correct times. `poweron` contains the required
information for phosphor-watchdog (more information on these can be found in the
[phosphor-watchdog][1] repository).

The two service files involved with the host watchdog are:

    phosphor-watchdog@poweron.service
    obmc-enable-host-watchdog@0.service

`phosphor-watchdog@poweron` starts the host watchdog service and
`obmc-enable-host-watchdog` starts the watchdog timer. Both are run as a part
of the `obmc-host-startmin@.target`. Service dependencies ensure the service is
started before the enable is called.

The default watchdog timeout can be found within the [D-Bus interface
specification][2] (`Interval` property).

Once the host starts, it controls the watchdog timeout and whether the watchdog
is enabled.

[1]: https://github.com/openbmc/phosphor-watchdog
[2]: https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/xyz/openbmc_project/State/Watchdog.interface.yaml