# BMC Reset with Host Booted

Author: Andrew Geissler (geissonator)

Primary assignee: Andrew Geissler (geissonator)

Other contributors:

Created: June 23rd, 2020

## Problem Description
BMCs can reboot for many reasons: a user request, a firmware update, or a
variety of error scenarios. When the BMC is rebooted while the host is up and
running, there needs to be a process by which the two synchronize with each
other and the BMC gets itself into a state that matches the host.

## Background and References
A good portion of this is explained in the phosphor-state-manager [README][1].

This design doc is written to formalize the design and add some more
details on dealing with both IPMI and PLDM communication to the host as well as
desired behavior when unable to talk with the host.

The high level flow is that OpenBMC software first checks for pgood, and if set,
it then checks the host to see if it is up and alive. If the power is on and the
host is running, then files are created in the filesystem to indicate this:
- /run/openbmc/chassis@0-on
- /run/openbmc/host@0-on

It should be noted that although full support is not in place for multi-chassis
and multi-host systems, the framework is there to build on.
`op-reset-chassis-running@.service` is a templated service that checks pgood in
its instance's power domain. It creates a file in the filesystem,
/run/openbmc/chassis@%i-on, to indicate power is on for that instance. A similar
implementation is done for the host via `phosphor-reset-host-check@.service` and
the file /run/openbmc/host@%i-on.

If chassis power is on and the host is up, then
`obmc-chassis-poweron@.target` and `obmc-host-start@.target` are started.

The /run/ files are used by OpenBMC services to determine if they need to run
or not when the chassis and host targets are started. For example, if the
chassis is already powered on and the host is running, there is no need to
actually turn power on, or start the host. The behavior wanted is that these
services "start" but do nothing. That is commonly done within a systemd service
file via the following:
- `ConditionPathExists=!/run/openbmc/chassis@%i-on`
- `ConditionPathExists=!/run/openbmc/host@%i-on`

This allows the targets to start and for the BMC to get in sync with the
state of the chassis and host without any special software checks.

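As a rough illustration of the negated `ConditionPathExists=` checks above, the
following Python sketch mimics the same decision outside of systemd. The
function name `should_run` and the temporary directory standing in for
/run/openbmc are assumptions made for this example; on a real system the check
is performed by systemd itself, and a unit whose condition is not met is
skipped without being marked as failed.

```python
import os
import tempfile


def should_run(marker_path: str) -> bool:
    """Mimic `ConditionPathExists=!<marker_path>`: run only if the marker is absent."""
    return not os.path.exists(marker_path)


# A temporary directory stands in for /run/openbmc in this sketch.
with tempfile.TemporaryDirectory() as run_dir:
    chassis_marker = os.path.join(run_dir, "chassis@0-on")

    # Marker absent: a power-on service gated on it would run normally.
    before = should_run(chassis_marker)

    # The reset-time pgood check found power already on, so the marker is created.
    open(chassis_marker, "w").close()

    # Marker present: the same service is now skipped.
    after = should_run(chassis_marker)
```
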
Different systems have different requirements on what the behavior should be
when the chassis power is on, but the host is unreachable. This design needs to
allow this customization. For example, some systems would prefer to just leave
the system in whatever state it is in and let the user correct things. Some
systems want to recover automatically (e.g., reboot the host) for the user.
Some systems have a hybrid approach where, depending on where the host was in
its boot, the BMC may leave it or recover.

## Requirements
- Support both IPMI and PLDM as mechanisms to determine if the host is running.
  - Allow either or both to be enabled
- Support custom behavior when chassis power is on but the BMC is unable to
  communicate with the host
- Both IPMI and PLDM stacks will give the host a set amount of time to
  respond. Lack of response within this time limit will result in the BMC
  potentially taking recovery actions.
  - This time limit must be configurable at build time

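The time-limit requirement above can be sketched as a bounded polling loop.
This is only an illustration: `HOST_RESPONSE_TIMEOUT_SECONDS`, `wait_for_host`,
and the `probe` callable are hypothetical names, the 5 second value is
arbitrary, and in a real build the limit would come from a build-time option
rather than a Python constant. The `clock` and `sleep` parameters exist only so
the loop can be exercised without real delays.

```python
import time
from typing import Callable

# Hypothetical stand-in for a value that would be fixed at build time.
HOST_RESPONSE_TIMEOUT_SECONDS = 5.0


def wait_for_host(
    probe: Callable[[], bool],
    timeout: float = HOST_RESPONSE_TIMEOUT_SECONDS,
    poll_interval: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if probe() succeeds before the time limit, else False."""
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            # No response within the limit; the caller may consider recovery.
            return False
        sleep(poll_interval)
```
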
### IPMI Detailed Requirements
- IPMI will continue to utilize SMS_ATN to indicate to the host that
  a "Get Message Flags Command" is requested. Upon the host issuing that
  command, it will be considered up and running

### PLDM Detailed Requirements
- PLDM will do a PDR exchange with the host to regenerate the BMC PDRs
- Where applicable, PLDM will provide a mechanism to distinguish between
  different host firmware stacks
  - For example, on IBM systems there is a difference between the
    hostboot (host initialization) firmware and Hypervisor firmware stacks.
    Both are host firmware and talk PLDM, but the BMC recovery paths will
    differ based on which is running

## Proposed Design
High Level Flow:
- Check pgood
- Check host via IPMI (enabled via compile option)
- Check host via PLDM (enabled via compile option)
- Check host via any custom mechanisms
- Execute automated recovery of host if desired

IPMI and PLDM software will be started as applicable. A combination of systemd
services and applications within phosphor-state-manager will coordinate the
checking of pgood and, if it is set, request the IPMI and PLDM applications to
discover if the host is running. Based on the response from these queries,
the software in phosphor-state-manager will take the appropriate action of
creating the /run files and starting the chassis and host targets, or entering
into recovery of the host.

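The high level flow can be sketched as a small decision function. This is a
simplified model, not the phosphor-state-manager implementation: the names
`Outcome` and `bmc_reset_flow` are hypothetical, each check is reduced to a
callable returning a boolean, and stopping at the first check that reports the
host as running is an assumption of this sketch. The `host_checks` sequence
would hold whichever of the IPMI, PLDM, and custom checks are enabled.

```python
from enum import Enum
from typing import Callable, Sequence


class Outcome(Enum):
    POWER_OFF = "power_off"        # pgood not set, nothing to synchronize
    HOST_RUNNING = "host_running"  # a check confirmed the host is alive
    RECOVER_HOST = "recover_host"  # host unreachable, automated recovery wanted
    LEAVE_AS_IS = "leave_as_is"    # host unreachable, left for the user


def bmc_reset_flow(
    pgood: Callable[[], bool],
    host_checks: Sequence[Callable[[], bool]],
    auto_recover: bool,
) -> Outcome:
    """Model the flow: pgood first, then each enabled host check in order."""
    if not pgood():
        return Outcome.POWER_OFF
    for check in host_checks:
        if check():
            return Outcome.HOST_RUNNING
    # Chassis power is on but no mechanism could reach the host.
    return Outcome.RECOVER_HOST if auto_recover else Outcome.LEAVE_AS_IS
```
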
The systemd targets responsible for this and any common services will be hosted
within phosphor-state-manager. Any system or company specific services can
be installed in the common targets:
- obmc-chassis-powerreset@.target.requires
- obmc-host-reset@.target.requires

### Note on custom mechanism for IBM systems
IBM systems will utilize a processor CFAM register. The specific register is
**Mailbox scratch register 12**.

If the chassis power is on but the BMC is unable to communicate with the
host via IPMI or PLDM, then the BMC will read this processor CFAM register.

The host code will write `0xA5000001` to this register to indicate when it has
reached a state in which it can boot an operating system without needing the
BMC. If the BMC sees this value written in the CFAM register, then it will leave
the host as-is, even if it is unable to communicate with the host over IPMI or
PLDM. It will log an error indicating it was unable to communicate with the host
but it will also show the host state as `Running`.

If the register is not `0xA5000001`, then the BMC will follow whatever recovery
mechanisms are defined for when the host has a failure (most likely a reboot
of the host).

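The register check reduces to a single comparison, sketched below. The names
`HOST_CAN_RUN_WITHOUT_BMC`, `SilentHostAction`, and `action_for_silent_host`
are hypothetical, and reading the CFAM register is out of scope here: the
function simply takes the value that was read, and is only meant to be
consulted once chassis power is on and both IPMI and PLDM have failed to reach
the host.

```python
from enum import Enum

# Value the host writes to mailbox scratch register 12, per the text above.
HOST_CAN_RUN_WITHOUT_BMC = 0xA5000001


class SilentHostAction(Enum):
    LEAVE_RUNNING_AND_LOG_ERROR = 1  # host state shown as Running, error logged
    RUN_HOST_RECOVERY = 2            # follow the system's defined recovery


def action_for_silent_host(scratch_reg_12: int) -> SilentHostAction:
    """Pick the BMC action when power is on but the host does not respond."""
    if scratch_reg_12 == HOST_CAN_RUN_WITHOUT_BMC:
        return SilentHostAction.LEAVE_RUNNING_AND_LOG_ERROR
    return SilentHostAction.RUN_HOST_RECOVERY
```
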
It is the responsibility of the host firmware to set this register as
applicable during the boot of the system. Host firmware will clear this register
in shutdown scenarios. To handle different host crash scenarios, the register
will also be cleared by the BMC firmware on power offs, system checkstops, and
during Memory Preserving reboots.

## Alternatives Considered
One thought was to avoid IPMI/PLDM altogether and only use a "lowest common
denominator" hardware register of some sort. The problem with that is you start
creating your own protocol, and before you know it you have something like IPMI
or PLDM anyway, except you created something custom. The IPMI or PLDM path is
expected to be sufficient in the vast majority of cases, and as a backup option,
system owners can put their own custom host-detection applications in.

## Impacts
None

## Testing
The normal path of IPMI and PLDM will be simple to test. Boot your system and
reboot the BMC. Verify the BMC chassis and host states are correct and verify
the host continued to run without issue throughout the BMC reset.

The more complicated tests will be error paths:
- Reboot the BMC while the host is booting, but before it's in a state where
  it can continue to run without the BMC. Verify the BMC detects this error
  scenario once it comes back from its reboot and takes the proper recovery
  actions for the host.
- Reboot the BMC when the host is up and running but disable the IPMI/PLDM stack
  on the host so it does not respond to the BMC when it comes back from its
  reboot. Ensure the BMC follows the defined recovery for the system in this
  situation.

[1]: https://github.com/openbmc/phosphor-state-manager#bmc-reset-with-host-andor-chassis-on