# BMC Reset with Host Booted

Author: Andrew Geissler (geissonator)

Other contributors:

Created: June 23rd, 2020

## Problem Description

BMCs can reboot for many different reasons: a user request, a firmware update,
or a variety of error scenarios. When the BMC is rebooted while the host is up
and running, there needs to be a process by which the two synchronize with each
other and the BMC gets itself into a state that matches the host.

## Background and References

A good portion of this is explained in the phosphor-state-manager [README][1].

This design doc is written to formalize the design and add some more details on
dealing with both IPMI and PLDM communication to the host as well as desired
behavior when unable to talk with the host.

The high level flow is that OpenBMC software first checks for pgood, and if set,
it then checks the host to see if it is up and alive. If the power is on and the
host is running, then files are created in the filesystem to indicate this:

- /run/openbmc/chassis@0-on
- /run/openbmc/host@0-on

It should be noted that although full support is not in place for multi-chassis
and multi-host systems, the framework is there to build on.
`op-reset-chassis-running@.service` is a templated service, checking pgood in
its instance's power domain. It creates a file in the filesystem,
/run/openbmc/chassis@%i-on, to indicate power is on for that instance. Similar
implementation is done for the host via `phosphor-reset-host-check@.service` and
the file /run/openbmc/host@%i-on.

If chassis power is on and the host is up, then `obmc-chassis-poweron@.target`
and `obmc-host-start@.target` are started.

The /run/ files are used by OpenBMC services to determine if they need to run or
not when the chassis and host targets are started.
For example, if the chassis
is already powered on and the host is running, there is no need to actually turn
power on, or start the host. The desired behavior is that these services "start"
but do nothing. That is commonly done within a systemd service file via the
following:

- `ConditionPathExists=!/run/openbmc/chassis@%i-on`
- `ConditionPathExists=!/run/openbmc/host@%i-on`

This allows the targets to start and for the BMC to get in sync with the state
of the chassis and host without any special software checks.

Different systems have different requirements on what the behavior should be
when the chassis power is on, but the host is unreachable. This design needs to
allow this customization. For example, some systems would prefer to just leave
the system in whatever state it is in and let the user correct things. Some
systems want to recover automatically (i.e. reboot the host) for the user. Some
systems have a hybrid approach where depending on where the host was in its
boot, the BMC may leave it or recover.

## Requirements

- Support both IPMI and PLDM as mechanisms to determine if the host is running.
  - Allow either or both to be enabled
- Support custom behavior when chassis power is on but the BMC is unable to
  communicate with the host
- Both IPMI and PLDM stacks will give the host a set amount of time to respond.
  Lack of response within this time limit will result in the BMC potentially
  taking recovery actions.
  - This time limit must be configurable at build time
- IPMI and PLDM will implement a phosphor-dbus-interface interface,
  `xyz.openbmc_project.Condition.HostFirmware`, which will have a
  `CurrentFirmwareCondition` property which other applications can read to
  determine if the host is running.

### IPMI Detailed Requirements

- IPMI will continue to utilize the SMS_ATN command to indicate to the host that
  a "Get Message Flags Command" is requested.
  Upon the host issuing that
  command, it will be considered up and running

### PLDM Detailed Requirements

- PLDM will utilize a GetTID command to the host to determine if it is running
- Where applicable, PLDM will provide a mechanism to distinguish between
  different host firmware stacks
  - For example, on IBM systems there is a difference between the hostboot (host
    initialization) firmware and Hypervisor firmware stacks. Both are host
    firmware and talk PLDM, but the BMC recovery paths will differ based on
    which is running. The `CurrentFirmwareCondition` property should not return
    "Running" unless the Hypervisor firmware is running.

## Proposed Design

High Level Flow:

- Check pgood
- Call mapper for all implementations of the
  `xyz.openbmc_project.Condition.HostFirmware` PDI interface
- Read the `CurrentFirmwareCondition` property of all interfaces. If any call
  returns that a host is running, then create the file and start the host
  target.
- Otherwise, check host via any custom mechanisms
- Execute automated recovery of host if desired

IPMI and PLDM software will be started as applicable. A combination of systemd
services and applications within phosphor-state-manager will coordinate the
checking of pgood, and if set, request the IPMI and PLDM applications to
discover if the host is running. Based on the response from these queries the
software in phosphor-state-manager will take the appropriate action of creating
the /run files and starting the chassis and host targets or entering into
recovery of the host.

The systemd targets responsible for this and any common services will be hosted
within phosphor-state-manager.
Any system- or company-specific services can be
installed in the common targets:

- obmc-chassis-powerreset@.target.requires
- obmc-host-reset@.target.requires

### Automated Recovery when host does not respond

A separate service and application will be created within phosphor-state-manager
to execute the following logic in situations where chassis power is on but the
host has failed to respond to any of the different mechanisms to communicate
with it:

- If chassis power is on (/run/openbmc/chassis@%i-on exists)
- And host is off (/run/openbmc/host@%i-on does not exist)
- And restored BootProgress is not None
- Then (assume host was booting before BMC reboot)
  - Log error indicating situation
  - Move host to Quiesce and allow automated recovery to kick in

### Note on custom mechanism for IBM systems

IBM systems will utilize a processor CFAM register. The specific register is
**Mailbox scratch register 12**.

If the chassis power is on but the BMC is unable to communicate with the host
via IPMI or PLDM, then the BMC will read this processor CFAM register.

The Host code will write `0xA5000001` to this register to indicate when it has
reached a state in which it can boot an operating system without needing the
BMC. If the BMC sees this value written in the CFAM register, then it will leave
the host as-is, even if it is unable to communicate with the host over IPMI or
PLDM. It will log an error indicating it was unable to communicate with the host
but it will also show the host state as `Running`.

If the register is not `0xA5000001`, then the BMC will follow whatever recovery
mechanisms are defined for when the host has a failure (most likely a reboot of
the host).

It is the responsibility of the host firmware to set this register as applicable
during the boot of the system. Host firmware will clear this register in
shutdown scenarios.
To handle different host crash scenarios, the register will
also be cleared by the BMC firmware on power offs, system checkstops, and
during Memory Preserving reboots.

## Alternatives Considered

One thought was to avoid IPMI/PLDM altogether and only use a "lowest common
denominator" hardware register of some sort. The problem with that is you start
creating your own protocol, and before you know it you have something like IPMI
or PLDM anyway, except you created something custom. So 99% of the time the IPMI
or PLDM path will be fine, and as a backup option, system owners can put their
own custom host-detection applications in.

## Impacts

None

## Testing

The normal path of IPMI and PLDM will be simple to test. Boot your system and
reboot the BMC. Verify the BMC chassis and host states are correct and verify
the host continued to run without issue throughout the BMC reset.

The more complicated tests will be error paths:

- Reboot the BMC while the host is booting, but before it's in a state where it
  can continue to run without the BMC. Verify the BMC detects this error
  scenario once it comes back from its reboot and takes the proper recovery
  actions for the host.
- Reboot the BMC when the host is up and running but disable the IPMI/PLDM stack
  on the host so it does not respond to the BMC when it comes back from its
  reboot. Ensure the BMC follows the defined recovery for the system in this
  situation.

[1]:
  https://github.com/openbmc/phosphor-state-manager#bmc-reset-with-host-andor-chassis-on
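
When writing down expected results for these tests, it may help to express the
decision flow from the Proposed Design section as a small pure function. The
sketch below is only an illustration of that flow and is not the
phosphor-state-manager implementation. The names `Outcome` and
`expected_outcome` are hypothetical, the `custom_check_passed` branch mirrors
the IBM CFAM behavior described above, and the `NO_ACTION` case is an
assumption, since this document does not say what happens when the restored
BootProgress is None.

```python
from enum import Enum
from typing import Iterable, Optional


class Outcome(Enum):
    """Hypothetical labels for what the BMC is expected to do after its reset."""

    CHASSIS_OFF = "chassis power is off; nothing to synchronize"
    HOST_RUNNING = "create host@%i-on and start the host target"
    RUNNING_UNREACHABLE = "log error; leave host as-is and report Running"
    QUIESCE = "log error; move host to Quiesce"
    NO_ACTION = "assumed: no sign the host was booting, so do nothing"


def expected_outcome(
    pgood: bool,
    firmware_conditions: Iterable[str],
    custom_check_passed: bool,
    restored_boot_progress: Optional[str],
) -> Outcome:
    """Return the expected BMC behavior for one chassis/host instance.

    firmware_conditions holds the CurrentFirmwareCondition value read from each
    HostFirmware implementation (IPMI, PLDM). restored_boot_progress is None
    when the restored BootProgress is None.
    """
    if not pgood:
        return Outcome.CHASSIS_OFF
    # Any implementation reporting "Running" means the host is up.
    if any(cond == "Running" for cond in firmware_conditions):
        return Outcome.HOST_RUNNING
    # System-specific fallback, e.g. the IBM CFAM register holding 0xA5000001.
    if custom_check_passed:
        return Outcome.RUNNING_UNREACHABLE
    # Chassis on, host unreachable, and the host had started booting.
    if restored_boot_progress is not None:
        return Outcome.QUIESCE
    return Outcome.NO_ACTION


if __name__ == "__main__":
    # First error-path test: host was mid-boot when the BMC rebooted.
    # "Off" and "MemoryInit" are illustrative placeholder values.
    print(expected_outcome(True, ["Off"], False, "MemoryInit").name)
```

The first error-path test above corresponds to the `QUIESCE` outcome, and the
second corresponds to either `RUNNING_UNREACHABLE` or the system's defined
recovery, depending on whether a custom mechanism is installed.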