# OpenBMC Server Power Recovery

Author: Andrew Geissler (geissonator)

Primary assignee: Andrew Geissler (geissonator)

Other contributors:

Created: October 11th, 2021

## Problem Description
Modern computer systems provide automated power-on recovery: the ability to
tell the system what to do after it loses power. If the system had a blackout
(i.e. power was completely cut to the system), should it automatically power
back on? Should it stay off? Or should it return to whichever state it was in
before the power loss?

The goal of this design document is to describe how OpenBMC firmware will
deal with these questions.

## Background and References
The BMC already implements a limited subset of this function. The
[PowerRestorePolicy][pdi-restore] property in phosphor-dbus-interfaces
defines the supported options.

In smaller servers, this feature is commonly found within the Advanced
Configuration and Power Interface (ACPI).

[openbmc/phosphor-state-manager][state-mgr] supports this property as defined
in phosphor-dbus-interfaces.

Future updates to this document will touch on more complex scenarios like
brownouts (chassis power loss but BMC remains on), handling of external
uninterruptible power supplies (UPS), and enhanced tracking of the different
types of errors that can occur in this area.

## Requirements
OpenBMC software must persist the chassis power state so that it knows what
state to restore if necessary.

OpenBMC software must provide support for the following options:
- Do nothing when power is lost to the system (this will be the
  default)
- Always power the system on and boot the host
- Always power the system off (previous power was on, power is now off, run
  all chassis power off services to ensure a clean state of software and
  hardware)
- Restore the previous state of the chassis power and host

These options are only checked and enforced when the BMC comes out of reboot
and does not detect that chassis power is already on.

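As a rough illustration of the options above, the sketch below maps a policy
and the persisted chassis power state to an action. It is not the
phosphor-state-manager implementation: the function name
`decide_power_action`, its arguments, and the action strings are hypothetical.
The policy names follow the `PowerRestorePolicy` values used in this document
(`None`, `AlwaysOn`, `AlwaysOff`, `Restore`). The sketch also assumes that
restoring a previous state of "off" needs no action, since chassis power is
already off.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
#   $1  policy:   None | AlwaysOn | AlwaysOff | Restore
#   $2  previous: chassis power state persisted before the outage (On | Off)
#   $3  current:  chassis power state detected after BMC reboot (On | Off)
# Prints the action to take: skip | none | power-on | power-off
decide_power_action() {
    dpa_policy=$1
    dpa_previous=$2
    dpa_current=$3

    # Policies are only enforced when chassis power is not already on.
    if [ "$dpa_current" = "On" ]; then
        echo "skip"
        return 0
    fi

    case "$dpa_policy" in
        None)
            echo "none"
            ;;
        AlwaysOn)
            echo "power-on"
            ;;
        AlwaysOff)
            # Run the chassis power off services to reach a clean state.
            echo "power-off"
            ;;
        Restore)
            if [ "$dpa_previous" = "On" ]; then
                echo "power-on"
            else
                # Assumption: power is already off, so nothing to do.
                echo "none"
            fi
            ;;
        *)
            echo "unknown policy: $dpa_policy" >&2
            return 1
            ;;
    esac
}
```
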
OpenBMC software must also support the concept of a one_time power restore
policy. This is a separate instance of the `PowerRestorePolicy` which will
be hosted under a D-Bus object path which ends with "one_time". If this
one_time setting is not the default, `None`, then software will execute
the policy defined under it, and then reset the one_time property to `None`.
This one_time feature lets software reuse the automated power-on recovery
function in other areas, such as firmware update scenarios where a specific
power-on behavior is desired once an update has completed.

## Proposed Design
After the chassis and host states have been determined, an application will
run, but only if chassis power is not on.

This application will look for the one_time setting and use it if its value
is not `None`. If it does use the one_time setting then it will reset it
to `None` once it has read it. Otherwise the application will read the
persistent value of the `PowerRestorePolicy`. The application will then
run the logic as defined in the Requirements above.

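The selection step described above can be sketched as follows. The shell
variables `ONE_TIME_POLICY` and `PERSISTENT_POLICY` are hypothetical stand-ins
for the two `PowerRestorePolicy` D-Bus properties, and `select_policy` is an
illustrative name. A real implementation would read and write the properties
over D-Bus rather than use shell variables.

```shell
#!/bin/sh
# Hypothetical stand-ins for the one_time and persistent policy properties.
ONE_TIME_POLICY="None"
PERSISTENT_POLICY="None"

# Pick the policy to enforce and store it in EFFECTIVE_POLICY.
# A one_time value other than None takes precedence and is consumed
# (reset to None) as soon as it has been read.
# Call this directly, not in $(...), so the reset is not lost in a subshell.
select_policy() {
    if [ "$ONE_TIME_POLICY" != "None" ]; then
        EFFECTIVE_POLICY=$ONE_TIME_POLICY
        ONE_TIME_POLICY="None"
    else
        EFFECTIVE_POLICY=$PERSISTENT_POLICY
    fi
}
```
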
This function will be hosted in phosphor-state-manager and potentially
x86-power-control.

## Alternatives Considered
None. This is a basic feature that does not have many alternatives
(other than not doing it).

## Impacts
None

## Testing
This policy can already be set via the Redfish API.
```
# Power Restore Policy
curl -k -X PATCH -d '{"PowerRestorePolicy":"AlwaysOn"}' https://${bmc}/redfish/v1/Systems/system
curl -k -X PATCH -d '{"PowerRestorePolicy":"AlwaysOff"}' https://${bmc}/redfish/v1/Systems/system
curl -k -X PATCH -d '{"PowerRestorePolicy":"LastState"}' https://${bmc}/redfish/v1/Systems/system
```
For testing, each policy should be set and verified. The one_time aspect should
also be checked for each possible value and verified to only be used once.

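When verifying results, note that the Redfish values in the commands above are
not all spelled like the D-Bus policy names. The Redfish value `LastState` is
assumed here to correspond to the D-Bus `Restore` policy, which should be
confirmed against the bmcweb version under test. The helper below is a small
sketch of that mapping for use in test scripts; the function name
`redfish_to_dbus_policy` is hypothetical.

```shell
#!/bin/sh
# Hypothetical test helper: translate a Redfish PowerRestorePolicy value
# into the policy name expected on the D-Bus side.
# The LastState -> Restore mapping is an assumption to verify.
redfish_to_dbus_policy() {
    case "$1" in
        AlwaysOn)  echo "AlwaysOn" ;;
        AlwaysOff) echo "AlwaysOff" ;;
        LastState) echo "Restore" ;;
        *)
            echo "unsupported Redfish policy: $1" >&2
            return 1
            ;;
    esac
}
```
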
Validate that when multiple blackouts occur, the firmware continues to try
to power on the system when the policy is `AlwaysOn` or `Restore`.

[pdi-restore]: https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Control/Power/RestorePolicy.interface.yaml
[state-mgr]: https://github.com/openbmc/phosphor-state-manager