# BMC Boot Ready

Author: Andrew Geissler (geissonator)

Other contributors:

Created: May 12, 2022

## Problem Description

Some services that run on the BMC are required for the BIOS (host firmware) to
power on and boot the system. The goal of this design is to define a mechanism
to ensure these dependencies are met before a power on or boot is started.

For example, on some systems, you cannot power on the chassis until the BMC has
collected the VPD from the VRMs to determine their characteristics. On other
systems, the BIOS service is needed so the host firmware can look for any
overrides.

Currently, OpenBMC has undefined behavior in this area. If a particular BMC has
a large time gap between when the webserver is available and when all BMC
services have completed running, there is a window in which a user could request
a power on via the webserver when not all needed services are running.

## Background and References

The mailing list discussion can be found [here][1]. The BMC currently has three
major [state][2] management interfaces in a system: the BMC, Chassis, and Host.
Within each state interface, the current state and requested state are tracked.

The [BMC][3] state object is considered `Ready` once the systemd
`multi-user.target` has successfully started all of its services.

There are three options that have been discussed to solve this issue:

1. D-Bus objects don't exist until the backend is prepared to handle them.
2. If a user tries something that the system is not in the proper state to
   handle, then return an error code.
3. If a user tries something that the system is not in the proper state to
   handle, then queue it up.

Option 1 is challenging because D-Bus interfaces provided by OpenBMC state
applications have a mix of read-only properties (like current state) and
writeable properties that are used to request state changes. Not showing any
until everything is available could have unknown consequences. This also has
similar issues to option 2 in that applications and clients must have proper
code to handle missing interfaces.

Option 2 is challenging because Redfish clients and internal applications like
the op-panel code would then need to properly handle error codes like this. You
can argue that they already should, but that is definitely not the case with a
lot of OpenBMC applications and clients.

Option 3 is the most user-friendly option. No client or OpenBMC application
changes would be needed. One concern is that having a system somewhat randomly
power on at some later point in time could be unexpected. The general consensus
in this review, though, has been that this is the preferred option.

[1]: https://lists.ozlabs.org/pipermail/openbmc/2022-April/030175.html
[2]:
  https://github.com/openbmc/phosphor-dbus-interfaces/tree/master/yaml/xyz/openbmc_project/State
[3]:
  https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/State/BMC.interface.yaml

## Requirements

- Queue up chassis and host requested state changes until the BMC is in the
  proper state to allow the request
  - What the "proper state" is will be implementation specific, but by default
    phosphor-state-manager will queue all requests until the BMC state has
    reached `Ready`

## Proposed Design

If a power on or boot request is made to the Chassis or Host state objects and
the BMC is not at `Ready`, then the request will be queued and the state
management code will begin monitoring for BMC `Ready`. Once `Ready` is reached,
the requested operation will be automatically executed.

## Alternatives Considered

The "Background and References" section covered some alternative options and the
complexity behind them.

Another option is to code the dependencies directly into the services. For
example, if the power on service requires the VRM VPD collection service, encode
that dependency in the systemd files. This is easy to say but in practice has
been challenging. Some OpenBMC services have built-in assumptions that the
`multi-user.target` and all of its dependent services have completed prior to a
power on being started. The general consensus within IBM was that it's much
easier and safer to just have a global wait-for-bmc-ready function as proposed
in this design.

## Impacts

Users will need to understand that their request to power on the system may be
delayed by an undefined amount of time. In general, a BMC gets to the `Ready`
state within a couple of minutes.

### Organizational

This function will be implemented within the existing phosphor-state-manager
repository. x86-power-control, an alternative to phosphor-state-manager, could
also implement this logic.

## Testing

- Ensure a power on request is properly queued and executed when it is made
  prior to the BMC being `Ready`.
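
The queue-until-`Ready` behavior from the Proposed Design section, and the test
case above, can be modeled with a minimal sketch. This is not the
phosphor-state-manager implementation: `BootRequestQueue`, `BMCState`,
`request_transition`, `on_bmc_state_changed`, and the `"On"` transition string
are hypothetical names chosen for illustration, and the choice to keep every
pending request in FIFO order (rather than only the latest one) is an
assumption.

```python
from enum import Enum
from typing import Callable, List


class BMCState(Enum):
    """Simplified stand-in for the BMC state property (hypothetical)."""

    NOT_READY = "NotReady"
    READY = "Ready"


class BootRequestQueue:
    """Toy model: hold chassis/host requests until the BMC is Ready."""

    def __init__(self, execute: Callable[[str], None]) -> None:
        # `execute` stands in for whatever actually starts the transition.
        self._execute = execute
        self._bmc_state = BMCState.NOT_READY
        self._pending: List[str] = []

    @property
    def pending(self) -> List[str]:
        """Return a copy of the requests that are still waiting."""
        return list(self._pending)

    def request_transition(self, transition: str) -> None:
        """Run the request now if the BMC is Ready, otherwise queue it."""
        if self._bmc_state is BMCState.READY:
            self._execute(transition)
        else:
            self._pending.append(transition)

    def on_bmc_state_changed(self, new_state: BMCState) -> None:
        """Record the new BMC state and drain the queue once Ready."""
        self._bmc_state = new_state
        if new_state is BMCState.READY:
            # Swap the list out first so the queue is empty while draining.
            pending, self._pending = self._pending, []
            for transition in pending:
                self._execute(transition)
```

In this model, a test for the case above would call `request_transition("On")`
before any state change, check that nothing was executed, then call
`on_bmc_state_changed(BMCState.READY)` and check that the queued request ran
exactly once.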