# BMC Boot Ready

Author: Andrew Geissler (geissonator)

Other contributors:

Created: May 12, 2022

## Problem Description

Some services that run on the BMC are required for the BIOS (host firmware)
to power on and boot the system. The goal of this design is to define a
mechanism to ensure these dependencies are met before a power on or boot is
started.

For example, on some systems, you cannot power on the chassis until the BMC
has collected the VPD from the VRMs to determine their characteristics. On
other systems, the BIOS service is needed so the host firmware can look for
any overrides.

Currently, OpenBMC has undefined behavior in this area. If a particular BMC
has a large time gap between when the webserver is available and when all BMC
services have completed running, there is a window in which a user could
request a power on via the webserver before all needed services are running.

## Background and References

The mailing list discussion can be found [here][1]. The BMC currently has
three major [state][2] management interfaces in a system: the BMC, Chassis,
and Host. Within each state interface, the current state and requested state
are tracked.

The [BMC][3] state object is considered `Ready` once the systemd
`multi-user.target` has successfully started all of its services.

There are three options that have been discussed to solve this issue:

1. D-Bus objects don't exist until the backend is prepared to handle them.
2. If a user tries something that the system is not in the proper state to
   handle, then return an error code.
3. If a user tries something that the system is not in the proper state to
   handle, then queue it up.

Option 1 is challenging because D-Bus interfaces provided by OpenBMC state
applications have a mix of read-only properties (like current state) and
writeable properties that are used to request state changes.
Not showing any of them until everything is available could have unknown
consequences. This also has similar issues to option 2 in that applications
and clients must have proper code to handle missing interfaces.

Option 2 is challenging because Redfish clients and internal applications like
the op-panel code now need to properly handle error codes like this. You can
argue that they already should, but that is definitely not the case with a lot
of OpenBMC applications and clients.

Option 3 is the most user-friendly option. No client or OpenBMC application
changes would be needed. One concern is that having a system somewhat randomly
power on at some later point in time could be unexpected. The general
consensus in this review, though, has been that this is the preferred option.

[1]: https://lists.ozlabs.org/pipermail/openbmc/2022-April/030175.html
[2]: https://github.com/openbmc/phosphor-dbus-interfaces/tree/master/yaml/xyz/openbmc_project/State
[3]: https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/State/BMC.interface.yaml

## Requirements

- Queue up chassis and host requested state changes until the BMC is in the
  proper state to allow the request
  - What the "proper state" is will be implementation specific, but by default
    phosphor-state-manager will queue all requests until the BMC state has
    reached `Ready`

## Proposed Design

If a power on or boot request is made to the Chassis or Host state objects and
the BMC is not at `Ready`, then the request will be queued and the state
management code will begin monitoring for BMC `Ready`. Once `Ready` is reached,
the requested operation will be automatically executed.

## Alternatives Considered

The "Background and References" section covered some alternative options
and the complexity behind them.

Another option is to code the dependencies directly into the services.
For example, if the power on service requires the VRM VPD collection service,
encode that dependency in the systemd files. This is easy to say but in
practice has been challenging. Some OpenBMC services have built-in assumptions
that the `multi-user.target` and all of its dependent services have completed
prior to a power on being started. The general consensus within IBM was that
it's much easier and safer to just have a global wait-for-bmc-ready function
as proposed in this design.

## Impacts

Users will need to understand that their request to power on the system may
be delayed by an undefined amount of time. In general, a BMC gets to the
`Ready` state within a couple of minutes.

### Organizational

This function will be implemented within the existing phosphor-state-manager
repository. x86-power-control, an alternative to phosphor-state-manager, could
also implement this logic.

## Testing

- Ensure a power on request is properly queued and executed when it is made
  prior to the BMC being `Ready`.
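
The queue-then-execute behavior that this test exercises can be pictured with
a minimal sketch. It is not the phosphor-state-manager implementation: the
class `TransitionQueue`, its methods, the `execute` callback, and the
`"NotReady"` placeholder state are hypothetical names chosen for illustration.
Only the `Ready` state name comes from this design. The sketch also assumes
that every queued request is replayed in the order it arrived, which this
design does not specify.

```python
from collections import deque
from typing import Callable, Deque, Tuple

BMC_READY = "Ready"  # state name from this design; all other names are hypothetical


class TransitionQueue:
    """Holds chassis/host transition requests until the BMC reports Ready."""

    def __init__(self, execute: Callable[[str, str], None],
                 bmc_state: str = "NotReady") -> None:
        self._execute = execute      # callback that performs the transition
        self._bmc_state = bmc_state  # last BMC state seen by this object
        self._pending: Deque[Tuple[str, str]] = deque()

    def request(self, target: str, transition: str) -> bool:
        """Execute now if the BMC is Ready, otherwise queue the request.

        Returns True when the request was executed immediately.
        """
        if self._bmc_state == BMC_READY:
            self._execute(target, transition)
            return True
        self._pending.append((target, transition))
        return False

    def on_bmc_state_changed(self, new_state: str) -> None:
        """Record the new BMC state and replay queued requests once Ready."""
        self._bmc_state = new_state
        if new_state != BMC_READY:
            return
        # Assumption: replay in arrival (FIFO) order.
        while self._pending:
            target, transition = self._pending.popleft()
            self._execute(target, transition)


# Mirror of the test case above: request before Ready, then reach Ready.
executed = []
queue = TransitionQueue(lambda target, transition: executed.append((target, transition)))
queue.request("chassis", "On")       # queued, BMC not yet Ready
queue.on_bmc_state_changed("Ready")  # queued request runs here
print(executed)                      # prints [('chassis', 'On')]
```

A real implementation would react to the BMC state property changing on D-Bus
rather than to a direct method call, and would have to decide how to treat
conflicting queued requests, such as a power on followed by a power off.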