# BMC Reset with Host Booted

Author: Andrew Geissler (geissonator)

Primary assignee: Andrew Geissler (geissonator)

Other contributors:

Created: June 23rd, 2020

## Problem Description
BMCs can reboot for many different reasons: a user request, a firmware update,
or a variety of error scenarios. When the BMC is rebooted while the
host is up and running, there needs to be a process by which the two synchronize
with each other and the BMC gets itself into a state that matches the host.

## Background and References
A good portion of this is explained in the phosphor-state-manager [README][1].

This design doc is written to formalize the design and add some more
details on dealing with both IPMI and PLDM communication to the host as well as
desired behavior when unable to talk with the host.

The high level flow is that OpenBMC software first checks for pgood, and if set,
it then checks the host to see if it is up and alive. If the power is on and the
host is running, then files are created in the filesystem to indicate this:
- /run/openbmc/chassis@0-on
- /run/openbmc/host@0-on

It should be noted that although full support is not in place for multi-chassis
and multi-host systems, the framework is there to build on.
`op-reset-chassis-running@.service` is a templated service that checks pgood in
its instance's power domain. It creates a file in the filesystem,
/run/openbmc/chassis@%i-on, to indicate power is on for that instance. A similar
implementation is done for the host via `phosphor-reset-host-check@.service` and
the file /run/openbmc/host@%i-on.
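
The per-instance marker naming described above can be sketched in a few lines of Python. This is only an illustration of the naming scheme, not the code the services run; the helper names `marker_path` and `mark_on` are hypothetical, and a temporary directory stands in for /run/openbmc.

```python
import tempfile
from pathlib import Path


def marker_path(run_dir: Path, kind: str, instance: int) -> Path:
    """Return the marker path for one instance, e.g. <run_dir>/chassis@0-on."""
    if kind not in ("chassis", "host"):
        raise ValueError(f"unknown kind: {kind}")
    return run_dir / f"{kind}@{instance}-on"


def mark_on(run_dir: Path, kind: str, instance: int) -> Path:
    """Create the (empty) marker file that records 'this instance is on'."""
    path = marker_path(run_dir, kind, instance)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.touch()
    return path


if __name__ == "__main__":
    # A temporary directory stands in for /run/openbmc in this sketch.
    with tempfile.TemporaryDirectory() as tmp:
        print(mark_on(Path(tmp), "chassis", 0).name)
        print(mark_on(Path(tmp), "host", 0).name)
```

The instance number plays the role of systemd's `%i` specifier in the templated services.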

If chassis power is on and the host is up, then
`obmc-chassis-poweron@.target` and `obmc-host-start@.target` are started.

The /run/ files are used by OpenBMC services to determine if they need to run
or not when the chassis and host targets are started. For example, if the
chassis is already powered on and the host is running, there is no need to
actually turn power on, or start the host. The behavior wanted is that these
services "start" but do nothing. That is commonly done within a systemd service
file via the following:
- `ConditionPathExists=!/run/openbmc/chassis@%i-on`
- `ConditionPathExists=!/run/openbmc/host@%i-on`

This allows the targets to start and for the BMC to get in sync with the
state of the chassis and host without any special software checks.
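
To make the negated condition concrete, the following Python sketch mimics how systemd evaluates a `ConditionPathExists=` value: a leading `!` inverts the check, so the condition holds only while the marker file is absent. The function name `condition_path_exists` is hypothetical and the sketch covers only the plain and `!`-negated forms.

```python
import tempfile
from pathlib import Path


def condition_path_exists(value: str) -> bool:
    """Evaluate a ConditionPathExists= value.

    A leading '!' negates the check: the condition then holds only
    when the path does NOT exist.
    """
    negate = value.startswith("!")
    path = Path(value[1:] if negate else value)
    return path.exists() != negate


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        marker = Path(tmp) / "chassis@0-on"
        cond = f"!{marker}"
        print(condition_path_exists(cond))  # marker absent: the unit would run
        marker.touch()
        print(condition_path_exists(cond))  # marker present: the unit is skipped
```

When a condition fails, systemd skips the unit without marking it failed, which is what lets the power-on services "start" but do nothing.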

Different systems have different requirements on what the behavior should be
when the chassis power is on, but the host is unreachable. This design needs to
allow this customization. For example, some systems would prefer to just leave
the system in whatever state it is in and let the user correct things. Some
systems want to recover automatically (e.g. reboot the host) for the user.
Some systems have a hybrid approach where, depending on where the host was in
its boot, the BMC may leave it or recover.

## Requirements
- Support both IPMI and PLDM as mechanisms to determine if the host is running.
  - Allow either or both to be enabled
- Support custom behavior when chassis power is on but the BMC is unable to
  communicate with the host
- Both IPMI and PLDM stacks will give the host a set amount of time to
  respond. Lack of response within this time limit will result in the BMC
  potentially taking recovery actions.
  - This time limit must be configurable at build time
- IPMI and PLDM will implement a phosphor-dbus-interface interface,
  `xyz.openbmc_project.Condition.HostFirmware`, which will have a
  `CurrentFirmwareCondition` property which other applications can read to
  determine if the host is running.
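
The response time limit in the requirements above can be pictured as a bounded polling loop. The Python sketch below is an assumption-laden illustration, not how the IPMI or PLDM stacks implement it: `wait_for_host`, `HOST_RESPONSE_TIMEOUT_S`, and the enum member names other than "Running" are hypothetical, and the module-level constant merely stands in for a build-time option.

```python
import time
from enum import Enum
from typing import Callable


class FirmwareCondition(Enum):
    # "Running" appears in this document; the other two names are assumptions.
    UNKNOWN = "Unknown"
    OFF = "Off"
    RUNNING = "Running"


# Stand-in for a build-time option; the value is illustrative only.
HOST_RESPONSE_TIMEOUT_S = 10.0


def wait_for_host(
    read_condition: Callable[[], FirmwareCondition],
    timeout_s: float = HOST_RESPONSE_TIMEOUT_S,
    interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the host reports Running or the time limit expires."""
    deadline = clock() + timeout_s
    while True:
        if read_condition() is FirmwareCondition.RUNNING:
            return True
        if clock() >= deadline:
            return False  # Caller may now consider recovery actions.
        sleep(interval_s)
```

Passing `clock` and `sleep` as parameters keeps the loop testable without real delays.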

### IPMI Detailed Requirements
- IPMI will continue to utilize the SMS_ATN command to indicate to the host that
  a "Get Message Flags Command" is requested. Upon the host issuing that
  command, it will be considered up and running

### PLDM Detailed Requirements
- PLDM will utilize a GetTID command to the host to determine if it is running
- Where applicable, PLDM will provide a mechanism to distinguish between
  different host firmware stacks
  - For example, on IBM systems there is a difference between the
    hostboot (host initialization) firmware and Hypervisor firmware stacks.
    Both are host firmware and talking PLDM but the BMC recovery paths will
    differ based on which is running. The `CurrentFirmwareCondition` property
    should not return "Running" unless the Hypervisor firmware is running.
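
The IBM example above boils down to a small mapping from "which firmware stack answered" to "should the property say Running". The sketch below shows only that rule; the `HostStack` labels and the `reports_running` helper are hypothetical, and how a PLDM implementation actually identifies the responding stack is outside what this document specifies.

```python
from enum import Enum


class HostStack(Enum):
    # Hypothetical labels for whatever answered the PLDM request.
    NO_RESPONSE = "no-response"
    HOSTBOOT = "hostboot"
    HYPERVISOR = "hypervisor"


def reports_running(stack: HostStack) -> bool:
    """Only the Hypervisor stack counts as a running host."""
    return stack is HostStack.HYPERVISOR


if __name__ == "__main__":
    for stack in HostStack:
        print(stack.value, reports_running(stack))
```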

## Proposed Design
High Level Flow:
- Check pgood
- Call mapper for all implementations of the
  `xyz.openbmc_project.Condition.HostFirmware` PDI interface
- Read the `CurrentFirmwareCondition` property of each interface. If any call
  returns that a host is running, then create the file and start the host target.
- Otherwise, check host via any custom mechanisms
- Execute automated recovery of host if desired
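
The flow above can be sketched as a single decision function. This is a simplified model under stated assumptions: the function name `bmc_reset_flow`, its parameters, and the action strings are hypothetical, the property values are passed in as plain strings rather than read over D-Bus, and a single chassis/host instance is assumed.

```python
from typing import Callable, Iterable, List


def bmc_reset_flow(
    pgood: bool,
    firmware_conditions: Iterable[str],
    custom_host_check: Callable[[], bool] = lambda: False,
    auto_recover: bool = False,
) -> List[str]:
    """Return the ordered actions the BMC would take after its own reset."""
    if not pgood:
        return []  # Chassis power is off: nothing to synchronize with.
    actions = ["mark chassis on"]
    # Any one interface reporting "Running" is enough.
    host_up = any(c == "Running" for c in firmware_conditions)
    if not host_up:
        # Fall back to a system-specific mechanism, if one is provided.
        host_up = custom_host_check()
    if host_up:
        actions += ["mark host on", "start host target"]
    elif auto_recover:
        actions.append("recover host")
    return actions


if __name__ == "__main__":
    print(bmc_reset_flow(True, ["Off", "Running"]))
    print(bmc_reset_flow(True, ["Unknown"], auto_recover=True))
```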

IPMI and PLDM software will be started as applicable. A combination of systemd
services and applications within phosphor-state-manager will coordinate the
checking of pgood, and if set, request the IPMI and PLDM applications to
discover if the host is running. Based on the response from these queries
the software in phosphor-state-manager will take the appropriate action of
creating the /run files and starting the chassis and host targets or entering
into recovery of the host.

The systemd targets responsible for this and any common services will be hosted
within phosphor-state-manager. Any system or company specific services can
be installed in the common targets:
- obmc-chassis-powerreset@.target.requires
- obmc-host-reset@.target.requires

### Automated Recovery when host does not respond

A separate service and application will be created within phosphor-state-manager
to execute the following logic in situations where chassis power is on
but the host has failed to respond to any of the different mechanisms to
communicate with it:
- If chassis power is on (/run/openbmc/chassis@%i-on exists)
- And host is off (/run/openbmc/host@%i-on does not exist)
- And restored BootProgress is not None
- Then (assume host was booting before BMC reboot)
  - Log error indicating situation
  - Move host to Quiesce and allow automated recovery to kick in
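
The three conditions above combine into one boolean test. The sketch below is illustrative only: `needs_quiesce` is a hypothetical name, the restored BootProgress is passed in as a plain string, `"Booting"` is a placeholder value rather than a real BootProgress stage, and a temporary directory stands in for /run/openbmc.

```python
import tempfile
from pathlib import Path


def needs_quiesce(run_dir: Path, instance: int, restored_boot_progress: str) -> bool:
    """True when power is on, the host is not marked on, and it had been booting."""
    chassis_on = (run_dir / f"chassis@{instance}-on").exists()
    host_on = (run_dir / f"host@{instance}-on").exists()
    was_booting = restored_boot_progress != "None"
    return chassis_on and not host_on and was_booting


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        run_dir = Path(tmp)
        (run_dir / "chassis@0-on").touch()
        # "Booting" is a placeholder for any restored value other than None.
        if needs_quiesce(run_dir, 0, "Booting"):
            print("log error, then move host to Quiesce")
```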

### Note on custom mechanism for IBM systems
IBM systems will utilize a processor CFAM register. The specific register is
**Mailbox scratch register 12**.

If the chassis power is on but the BMC is unable to communicate with the
host via IPMI or PLDM, then the BMC will read this processor CFAM register.

The Host code will write `0xA5000001` to this register to indicate when it has
reached a state in which it can boot an operating system without needing the
BMC. If the BMC sees this value written in the CFAM register, then it will leave
the host as-is, even if it is unable to communicate with the host over IPMI or
PLDM. It will log an error indicating it was unable to communicate with the host
but it will also show the host state as `Running`.

If the register is not `0xA5000001`, then the BMC will follow whatever recovery
mechanisms are defined for when the host has a failure (most likely a reboot
of the host).
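
The register check described above amounts to a single comparison. In the sketch below the register value is passed in as an integer, since how the CFAM register is actually read is not covered here; the names `HOST_RUNNING_MAGIC` and `handle_unreachable_host` and the action strings are hypothetical.

```python
from typing import List

HOST_RUNNING_MAGIC = 0xA5000001  # Value the host writes, per the text above.


def handle_unreachable_host(scratch_value: int) -> List[str]:
    """Decide what to do when power is on but IPMI/PLDM got no answer."""
    if scratch_value == HOST_RUNNING_MAGIC:
        # Host can run without the BMC: leave it alone, but record the problem.
        return ["log communication error", "report host state Running"]
    # Any other value: fall back to the system's host failure recovery.
    return ["run host failure recovery"]


if __name__ == "__main__":
    print(handle_unreachable_host(0xA5000001))
    print(handle_unreachable_host(0x00000000))
```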

It is the responsibility of the host firmware to set this register as
applicable during the boot of the system. Host firmware will clear this register
in shutdown scenarios. To handle different host crash scenarios, the register
will also be cleared by the BMC firmware on power offs, system checkstops, and
during Memory Preserving reboots.

## Alternatives Considered
One thought was to avoid IPMI/PLDM altogether and only use a "lowest common
denominator" hardware register of some sort. The problem with that is you start
creating your own protocol, and before you know it you have something like IPMI
or PLDM anyway, except you created something custom. In the vast majority of
cases the IPMI or PLDM path will be fine, and as a backup option, system owners
can put their own custom host-detection applications in.

## Impacts
None

## Testing
The normal path of IPMI and PLDM will be simple to test. Boot your system and
reboot the BMC. Verify the BMC chassis and host states are correct and verify
the host continued to run without issue throughout the BMC reset.

The more complicated tests will be error paths:
- Reboot the BMC while the host is booting, but before it's in a state where
  it can continue to run without the BMC. Verify the BMC detects this error
  scenario once it comes back from its reboot and takes the proper recovery
  actions for the host.
- Reboot the BMC when the host is up and running but disable the IPMI/PLDM stack
  on the host so it does not respond to the BMC when it comes back from its
  reboot. Ensure the BMC follows the defined recovery for the system in this
  situation.

[1]: https://github.com/openbmc/phosphor-state-manager#bmc-reset-with-host-andor-chassis-on