# BMC Reset with Host Booted

Author: Andrew Geissler (geissonator)

Other contributors:

Created: June 23rd, 2020

## Problem Description

BMCs can reboot for many different reasons: a user request, a firmware update,
or any of a variety of error scenarios. When the BMC is rebooted while the
host is up and running, there needs to be a process by which the two synchronize
with each other and the BMC gets itself into a state that matches the host.

## Background and References

A good portion of this is explained in the phosphor-state-manager [README][1].

This design doc is written to formalize the design and add some more details on
dealing with both IPMI and PLDM communication to the host as well as desired
behavior when unable to talk with the host.

The high level flow is that OpenBMC software first checks for pgood, and if set,
it then checks the host to see if it is up and alive. If the power is on and the
host is running, then files are created in the filesystem to indicate this:

- /run/openbmc/chassis@0-on
- /run/openbmc/host@0-on

It should be noted that although full support is not in place for multi-chassis
and multi-host systems, the framework is there to build on.
`op-reset-chassis-running@.service` is a templated service that checks pgood in
its instance's power domain. It creates a file in the filesystem,
/run/openbmc/chassis@%i-on, to indicate power is on for that instance. A similar
implementation is done for the host via `phosphor-reset-host-check@.service` and
the file /run/openbmc/host@%i-on.

If chassis power is on and the host is up, then `obmc-chassis-poweron@.target`
and `obmc-host-start@.target` are started.
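The flow described above can be sketched in a few lines of Python. This is only an illustration, not the phosphor-state-manager implementation: the function name `sync_after_bmc_reset`, the `run_dir` parameter (standing in for /run/openbmc), and the choice to start no targets when the host is not confirmed are all assumptions made for this example.

```python
import tempfile
from pathlib import Path
from typing import List


def sync_after_bmc_reset(run_dir: Path, instance: int, pgood: bool,
                         host_alive: bool) -> List[str]:
    """Create the marker files for one instance and return the targets to start.

    Hypothetical helper: run_dir stands in for /run/openbmc.
    """
    if not pgood:
        return []  # chassis power is off: nothing to record
    run_dir.mkdir(parents=True, exist_ok=True)
    (run_dir / f"chassis@{instance}-on").touch()
    if not host_alive:
        return []  # power is on but the host is not confirmed: no host marker
    (run_dir / f"host@{instance}-on").touch()
    return [
        f"obmc-chassis-poweron@{instance}.target",
        f"obmc-host-start@{instance}.target",
    ]


if __name__ == "__main__":
    demo_dir = Path(tempfile.mkdtemp()) / "openbmc"
    print(sync_after_bmc_reset(demo_dir, 0, pgood=True, host_alive=True))
```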

The /run/ files are used by OpenBMC services to determine if they need to run or
not when the chassis and host targets are started. For example, if the chassis
is already powered on and the host is running, there is no need to actually turn
power on, or start the host. The behavior wanted is that these services "start"
but do nothing. That is commonly done within a systemd service file via the
following:

- `ConditionPathExists=!/run/openbmc/chassis@%i-on`
- `ConditionPathExists=!/run/openbmc/host@%i-on`

This allows the targets to start and for the BMC to get in sync with the state
of the chassis and host without any special software checks.
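To make the effect of the negated condition concrete, here is a small Python model of how such a line is evaluated for one template instance. It is a simplification written for this document (systemd itself does the real evaluation, and only the `!` prefix and `%i` specifier are modelled); the function name and the `root` parameter, which lets the example run against a scratch directory, are assumptions.

```python
import tempfile
from pathlib import Path


def condition_path_exists(value: str, instance: str, root: Path) -> bool:
    """Return True if a unit with this ConditionPathExists= value would run.

    A leading "!" negates the check; "%i" is replaced by the instance name.
    """
    negate = value.startswith("!")
    template = value[1:] if negate else value
    path = root / template.replace("%i", instance).lstrip("/")
    return path.exists() != negate


if __name__ == "__main__":
    scratch = Path(tempfile.mkdtemp())
    condition = "!/run/openbmc/host@%i-on"
    # No marker file yet, so a service guarded by this condition would run.
    print(condition_path_exists(condition, "0", scratch))
```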

Different systems have different requirements on what the behavior should be
when the chassis power is on, but the host is unreachable. This design needs to
allow this customization. For example, some systems would prefer to just leave
the system in whatever state it is in and let the user correct things. Some
systems want to recover automatically (i.e. reboot the host) for the user. Some
systems have a hybrid approach where depending on where the host was in its
boot, the BMC may leave it or recover.

## Requirements

- Support both IPMI and PLDM as mechanisms to determine if the host is running.
  - Allow either or both to be enabled
- Support custom behavior when chassis power is on but the BMC is unable to
  communicate with the host
- Both IPMI and PLDM stacks will give the host a set amount of time to respond.
  Lack of response within this time limit will result in the BMC potentially
  taking recovery actions.
  - This time limit must be configurable at build time
- IPMI and PLDM will implement a phosphor-dbus-interfaces interface,
  `xyz.openbmc_project.Condition.HostFirmware`, which will have a
  `CurrentFirmwareCondition` property which other applications can read to
  determine if the host is running.
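As a rough illustration of the time-limit requirement, the following Python sketch polls a host-response callback until a deadline and maps the outcome to a firmware condition. Everything here is an assumption made for the example: `probe_host`, the `host_responded` callback, and the polling interval are hypothetical, only the `Running` value is named in this document, and the real stacks are not written in Python.

```python
import time
from enum import Enum
from typing import Callable


class FirmwareCondition(Enum):
    # Only "Running" is named in this design; the other values are
    # placeholders chosen for this sketch.
    UNKNOWN = "Unknown"
    OFF = "Off"
    RUNNING = "Running"


def probe_host(host_responded: Callable[[], bool], timeout_s: float,
               poll_s: float = 0.01) -> FirmwareCondition:
    """Poll the host until it responds or the time limit expires.

    timeout_s stands in for the build-time configurable limit.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if host_responded():
            return FirmwareCondition.RUNNING
        if time.monotonic() >= deadline:
            return FirmwareCondition.OFF
        time.sleep(poll_s)
```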

### IPMI Detailed Requirements

- IPMI will continue to utilize the SMS_ATN command to indicate to the host that
  a "Get Message Flags Command" is requested. Upon the host issuing that
  command, it will be considered up and running.

### PLDM Detailed Requirements

- PLDM will utilize a GetTID command to the host to determine if it is running
- Where applicable, PLDM will provide a mechanism to distinguish between
  different host firmware stacks
  - For example, on IBM systems there is a difference between the hostboot (host
    initialization) firmware and Hypervisor firmware stacks. Both are host
    firmware and talk PLDM, but the BMC recovery paths will differ based on
    which is running. The `CurrentFirmwareCondition` property should not return
    "Running" unless the Hypervisor firmware is running.
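The rule in the IBM example can be stated as a one-line predicate. This is a minimal sketch, assuming the PLDM stack can tell which firmware stack answered; the function name and the `"hostboot"` / `"hypervisor"` labels are hypothetical and chosen only for illustration.

```python
def reports_running(responded_to_get_tid: bool, firmware_stack: str) -> bool:
    """Return True only when the Hypervisor stack answered GetTID.

    A response from hostboot proves the host firmware is alive, but it
    must not be reported as "Running".
    """
    return responded_to_get_tid and firmware_stack == "hypervisor"
```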

## Proposed Design

High Level Flow:

- Check pgood
- Call mapper for all implementations of
  `xyz.openbmc_project.Condition.HostFirmware` PDI interface
- Read `CurrentFirmwareCondition` property of all interfaces. If any call
  returns that a host is running then create file and start host target.
- Otherwise, check host via any custom mechanisms
- Execute automated recovery of host if desired
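The high level flow can be read as a short decision function. The Python below is a sketch under stated assumptions: the property values arrive as plain strings already read from D-Bus, the custom mechanisms are modelled as callables, a passing custom check is treated the same as a `Running` response, and the returned action names are hypothetical labels rather than real service or target names.

```python
from typing import Callable, Iterable, List


def host_reset_check(pgood: bool,
                     firmware_conditions: Iterable[str],
                     custom_checks: Iterable[Callable[[], bool]],
                     auto_recover: bool) -> List[str]:
    """Return the (hypothetical) actions to take for one host instance."""
    host_is_up = ["create-host-file", "start-host-target"]
    if not pgood:
        return []  # chassis power is off, nothing to synchronize
    # Any HostFirmware implementation reporting "Running" is sufficient.
    if any(cond == "Running" for cond in firmware_conditions):
        return host_is_up
    # Otherwise fall back to system-specific mechanisms.
    if any(check() for check in custom_checks):
        return host_is_up
    return ["recover-host"] if auto_recover else []
```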

IPMI and PLDM software will be started as applicable. A combination of systemd
services and applications within phosphor-state-manager will coordinate the
checking of pgood, and if set, request the IPMI and PLDM applications to
discover if the host is running. Based on the response from these queries the
software in phosphor-state-manager will take the appropriate action of creating
the /run files and starting the chassis and host targets or entering into
recovery of the host.

The systemd targets responsible for this and any common services will be hosted
within phosphor-state-manager. Any system or company specific services can be
installed in the common targets:

- obmc-chassis-powerreset@.target.requires
- obmc-host-reset@.target.requires

### Automated Recovery when host does not respond

A separate service and application will be created within phosphor-state-manager
to execute the following logic in situations where chassis power is on but the
host has failed to respond to any of the different mechanisms to communicate
with it:

- If chassis power is on (/run/openbmc/chassis@%i-on exists)
- And host is off (/run/openbmc/host@%i-on does not exist)
- And restored BootProgress is not None
- Then (assume host was booting before BMC reboot)
  - Log error indicating situation
  - Move host to Quiesce and allow automated recovery to kick in
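The condition list above translates directly into a predicate. The following Python sketch assumes the marker files live under a `run_dir` directory (standing in for /run/openbmc) and that the restored BootProgress value is available as a string, with `"None"` used as in the list; the function name is hypothetical, and logging the error and moving the host to Quiesce are left to the caller.

```python
import tempfile
from pathlib import Path


def host_needs_quiesce(run_dir: Path, instance: int,
                       restored_boot_progress: str) -> bool:
    """Return True when the host was apparently booting before the BMC reset."""
    chassis_on = (run_dir / f"chassis@{instance}-on").exists()
    host_on = (run_dir / f"host@{instance}-on").exists()
    return chassis_on and not host_on and restored_boot_progress != "None"


if __name__ == "__main__":
    demo_dir = Path(tempfile.mkdtemp())
    (demo_dir / "chassis@0-on").touch()
    # Power is on, no host marker, and the host had made boot progress.
    print(host_needs_quiesce(demo_dir, 0, "MemoryInit"))
```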

### Note on custom mechanism for IBM systems

IBM systems will utilize a processor CFAM register. The specific register is
**Mailbox scratch register 12**.

If the chassis power is on but the BMC is unable to communicate with the host
via IPMI or PLDM, then the BMC will read this processor CFAM register.

The Host code will write `0xA5000001` to this register to indicate when it has
reached a state in which it can boot an operating system without needing the
BMC. If the BMC sees this value written in the CFAM register, then it will leave
the host as-is, even if it is unable to communicate with the host over IPMI or
PLDM. It will log an error indicating it was unable to communicate with the host
but it will also show the host state as `Running`.

If the register is not `0xA5000001`, then the BMC will follow whatever recovery
mechanisms are defined for when the host has a failure (most likely a reboot of
the host).
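The register check reduces to a single comparison. This Python sketch assumes the register value has already been read and is passed in as an integer; reading the CFAM itself is out of scope, and the function name and the returned action labels are hypothetical.

```python
from typing import List

# Value the host writes once it can boot an OS without the BMC.
HOST_RUNNING_MAGIC = 0xA5000001


def actions_for_scratch_register(value: int) -> List[str]:
    """Decide what to do when IPMI/PLDM failed but chassis power is on."""
    if value == HOST_RUNNING_MAGIC:
        # Leave the host alone, but record that communication failed.
        return ["log-host-comm-error", "report-host-running"]
    return ["run-host-recovery"]
```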

It is the responsibility of the host firmware to set this register as applicable
during the boot of the system. Host firmware will clear this register in
shutdown scenarios. To handle different host crash scenarios, the register will
also be cleared by the BMC firmware on power offs, system checkstops, and
during Memory Preserving reboots.

## Alternatives Considered

One thought was to avoid IPMI/PLDM altogether and only use a "lowest common
denominator" hardware register of some sort. The problem with that is you start
creating your own protocol, and before you know it you have something like IPMI
or PLDM anyway, except you created something custom. The expectation is that the
IPMI or PLDM path will be fine in nearly all cases, and as a backup option,
system owners can put their own custom host-detection applications in.

## Impacts

None

## Testing

The normal path of IPMI and PLDM will be simple to test. Boot your system and
reboot the BMC. Verify the BMC chassis and host states are correct and verify
the host continued to run without issue throughout the BMC reset.

The more complicated tests will be error paths:

- Reboot the BMC while the host is booting, but before it's in a state where it
  can continue to run without the BMC. Verify the BMC detects this error
  scenario once it comes back from its reboot and takes the proper recovery
  actions for the host.
- Reboot the BMC when the host is up and running but disable the IPMI/PLDM stack
  on the host so it does not respond to the BMC when it comes back from its
  reboot. Ensure the BMC follows the defined recovery for the system in this
  situation.

[1]:
  https://github.com/openbmc/phosphor-state-manager#bmc-reset-with-host-andor-chassis-on