# BMC Reset with Host Booted

Author: Andrew Geissler (geissonator)

Primary assignee: Andrew Geissler (geissonator)

Other contributors:

Created: June 23rd, 2020

## Problem Description
BMCs can reboot for many reasons: user request, firmware update, and a variety
of error scenarios. When the BMC is rebooted while the host is up and running,
there needs to be a process by which the two synchronize with each other and
the BMC gets itself into a state that matches the host.

## Background and References
A good portion of this is explained in the phosphor-state-manager [README][1].

This design doc is written to formalize the design and add some more
details on dealing with both IPMI and PLDM communication to the host as well as
desired behavior when unable to talk with the host.

The high level flow is that OpenBMC software first checks for pgood, and if set,
it then checks the host to see if it is up and alive.
If the power is on and the host is running, then files are created in the
filesystem to indicate this:
- /run/openbmc/chassis@0-on
- /run/openbmc/host@0-on

It should be noted that although full support is not in place for multi-chassis
and multi-host systems, the framework is there to build on.
`op-reset-chassis-running@.service` is a templated service, checking pgood in
its instance's power domain. It creates a file in the filesystem,
/run/openbmc/chassis@%i-on, to indicate power is on for that instance. A similar
implementation exists for the host via `phosphor-reset-host-check@.service` and
the file /run/openbmc/host@%i-on.

If chassis power is on and the host is up, then
`obmc-chassis-poweron@.target` and `obmc-host-start@.target` are started.

The /run/ files are used by OpenBMC services to determine if they need to run
or not when the chassis and host targets are started. For example, if the
chassis is already powered on and the host is running, there is no need to
actually turn power on, or start the host. The behavior wanted is that these
services "start" but do nothing.
That is commonly done within a systemd service file via the following:
- `ConditionPathExists=!/run/openbmc/chassis@%i-on`
- `ConditionPathExists=!/run/openbmc/host@%i-on`

This allows the targets to start and for the BMC to get in sync with the
state of the chassis and host without any special software checks.

Different systems have different requirements on what the behavior should be
when the chassis power is on, but the host is unreachable. This design needs to
allow this customization. For example, some systems would prefer to just leave
the system in whatever state it is in and let the user correct things. Some
systems want to recover automatically (i.e. reboot the host) for the user.
Some systems have a hybrid approach where, depending on where the host was in
its boot, the BMC may leave it or recover.

## Requirements
- Support both IPMI and PLDM as mechanisms to determine if the host is running.
  - Allow either or both to be enabled
- Support custom behavior when chassis power is on but the BMC is unable to
  communicate with the host
- Both IPMI and PLDM stacks will give the host a set amount of time to
  respond. Lack of response within this time limit will result in the BMC
  potentially taking recovery actions.
  - This time limit must be configurable at build time

### IPMI Detailed Requirements
- IPMI will continue to utilize the SMS_ATN command to indicate to the host that
  a "Get Message Flags Command" is requested. Upon the host issuing that
  command, it will be considered up and running.

### PLDM Detailed Requirements
- PLDM will do a PDR exchange with the host to regenerate the BMC PDRs
- Where applicable, PLDM will provide a mechanism to distinguish between
  different host firmware stacks
  - For example, on IBM systems there is a difference between the
    hostboot (host initialization) firmware and Hypervisor firmware stacks.
    Both are host firmware and talk PLDM, but the BMC recovery paths will
    differ based on which is running

## Proposed Design
High Level Flow:
- Check pgood
- Check host via IPMI (enabled via compile option)
- Check host via PLDM (enabled via compile option)
- Check host via any custom mechanisms
- Execute automated recovery of host if desired

IPMI and PLDM software will be started as applicable. A combination of systemd
services and applications within phosphor-state-manager will coordinate the
checking of pgood, and if set, request the IPMI and PLDM applications to
discover if the host is running.
Based on the response from these queries,
the software in phosphor-state-manager will take the appropriate action of
creating the /run files and starting the chassis and host targets or entering
into recovery of the host.

The systemd targets responsible for this and any common services will be hosted
within phosphor-state-manager. Any system or company specific services can
be installed in the common targets:
- obmc-chassis-powerreset@.target.requires
- obmc-host-reset@.target.requires

### Note on custom mechanism for IBM systems
IBM systems will utilize a processor CFAM register. The specific register is
**Mailbox scratch register 12**.

If the chassis power is on but the BMC is unable to communicate with the
host via IPMI or PLDM, then the BMC will read this processor CFAM register.

The Host code will write `0xA5000001` to this register to indicate when it has
reached a state in which it can boot an operating system without needing the
BMC. If the BMC sees this value written in the CFAM register, then it will leave
the host as-is, even if it is unable to communicate with the host over IPMI or
PLDM. It will log an error indicating it was unable to communicate with the host
but it will also show the host state as `Running`.
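
A minimal sketch of the decision the BMC makes from this register once IPMI and
PLDM have both failed to reach the host. It is written in Python purely for
illustration; the real logic lives in the BMC firmware. The names
`HOST_CAN_RUN_WITHOUT_BMC`, `UnreachableHostAction`, and
`decide_unreachable_host_action` are hypothetical, and reading the CFAM register
itself is hardware specific, so the value is simply passed in. Only the
`0xA5000001` value and the two outcomes come from this design.

```python
from enum import Enum

# Value the host writes to Mailbox scratch register 12 (from the design text).
HOST_CAN_RUN_WITHOUT_BMC = 0xA5000001


class UnreachableHostAction(Enum):
    """Hypothetical names for the two outcomes described in this section."""

    # Leave the host as-is, log a communication error, report host `Running`.
    LEAVE_HOST_AND_LOG_ERROR = "leave-host-and-log-error"
    # Follow whatever host-failure recovery the system defines.
    RUN_HOST_FAILURE_RECOVERY = "run-host-failure-recovery"


def decide_unreachable_host_action(scratch_reg_value: int) -> UnreachableHostAction:
    """Pick the action for 'chassis power on, host not answering IPMI/PLDM'.

    scratch_reg_value is the already-read register content; how it is read
    is outside the scope of this sketch.
    """
    # Only an exact match means the host can keep running without the BMC.
    if scratch_reg_value == HOST_CAN_RUN_WITHOUT_BMC:
        return UnreachableHostAction.LEAVE_HOST_AND_LOG_ERROR
    # Any other value (including a cleared register) leads to recovery.
    return UnreachableHostAction.RUN_HOST_FAILURE_RECOVERY
```

Keeping the decision a pure function of the register value makes it easy to
exercise without hardware, for example with a cleared register (`0x0`).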

If the register is not `0xA5000001`, then the BMC will follow whatever recovery
mechanisms are defined for when the host has a failure (most likely a reboot
of the host).

It is the responsibility of the host firmware to set this register as
applicable during the boot of the system. Host firmware will clear this register
in shutdown scenarios. To handle different host crash scenarios, the register
will also be cleared by the BMC firmware on power offs, system checkstops, and
during Memory Preserving reboots.


## Alternatives Considered
One thought was to avoid IPMI/PLDM altogether and only use a "lowest common
denominator" hardware register of some sort. The problem with that is you start
creating your own protocol, and before you know it you have something like IPMI
or PLDM anyway, except you created something custom. So 99% of the time the
IPMI or PLDM path will be fine, and as a backup option, system owners can
put their own custom host-detection applications in.

## Impacts
None

## Testing
The normal path of IPMI and PLDM will be simple to test. Boot your system and
reboot the BMC. Verify the BMC chassis and host states are correct and verify
the host continued to run without issue throughout the BMC reset.

The more complicated tests will be error paths:
- Reboot the BMC while the host is booting, but before it's in a state where
  it can continue to run without the BMC. Verify the BMC detects this error
  scenario once it comes back from its reboot and takes the proper recovery
  actions for the host.
- Reboot the BMC when the host is up and running but disable the IPMI/PLDM stack
  on the host so it does not respond to the BMC when it comes back from its
  reboot. Ensure the BMC follows the defined recovery for the system in this
  situation.

[1]: https://github.com/openbmc/phosphor-state-manager#bmc-reset-with-host-andor-chassis-on
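
When working out what each test scenario above should observe, it may help to
model the reset flow as a small function. The following is only a sketch under
stated assumptions: the real flow is implemented by systemd services and
applications in phosphor-state-manager, not by Python, and the function name
`model_reset_flow`, its parameters, and its return strings are hypothetical.
The marker file names and the order of checks (pgood first, then the host
checks in sequence) follow this design.

```python
from pathlib import Path
from typing import Callable, Iterable


def model_reset_flow(
    run_dir: Path,
    pgood: bool,
    host_checks: Iterable[Callable[[], bool]],
    instance: int = 0,
) -> str:
    """Model the BMC reset flow and create the marker files in run_dir.

    host_checks stands in for the IPMI, PLDM, and custom host checks; each
    callable returns True if it found the host running.
    """
    if not pgood:
        # Chassis power is off, so there is nothing to synchronize with.
        return "chassis-off"

    # Power is on: record it so power-on services become no-ops.
    run_dir.mkdir(parents=True, exist_ok=True)
    (run_dir / f"chassis@{instance}-on").touch()

    # any() stops at the first check that reports the host as running.
    if any(check() for check in host_checks):
        (run_dir / f"host@{instance}-on").touch()
        return "host-running"

    # What happens next (leave as-is or recover) is system specific.
    return "host-unreachable"
```

For the second error path above (host stack disabled), the model returns
`"host-unreachable"` and leaves only the chassis marker file behind, which is
the point at which the system-defined recovery would be expected to start.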