# BMC Reset with Host Booted

Author: Andrew Geissler (geissonator)

Primary assignee: Andrew Geissler (geissonator)

Other contributors:

Created: June 23rd, 2020

## Problem Description
BMCs can reboot for many reasons: a user request, a firmware update, or any of
a variety of error scenarios. When the BMC is rebooted while the host is up and
running, there needs to be a process by which the two synchronize with each
other and the BMC gets itself into a state that matches the host.

## Background and References
A good portion of this is explained in the phosphor-state-manager [README][1].

This design doc is written to formalize the design and add some more
details on dealing with both IPMI and PLDM communication to the host as well as
desired behavior when unable to talk with the host.

The high-level flow is that OpenBMC software first checks for pgood, and if set,
it then checks the host to see if it is up and alive. If the power is on and the
host is running, then files are created in the filesystem to indicate this:
- /run/openbmc/chassis@0-on
- /run/openbmc/host@0-on

It should be noted that although full support is not in place for multi-chassis
and multi-host systems, the framework is there to build on.
`op-reset-chassis-running@.service` is a templated service that checks pgood in
its instance's power domain. It creates a file in the filesystem,
/run/openbmc/chassis@%i-on, to indicate power is on for that instance. A similar
implementation exists for the host via `phosphor-reset-host-check@.service` and
the file /run/openbmc/host@%i-on.

If chassis power is on and the host is up, then
`obmc-chassis-poweron@.target` and `obmc-host-start@.target` are started.

The /run/ files are used by OpenBMC services to determine whether they need to
run when the chassis and host targets are started. For example, if the
chassis is already powered on and the host is running, there is no need to
actually turn power on or start the host. The desired behavior is that these
services "start" but do nothing. That is commonly done within a systemd service
file via the following:
- `ConditionPathExists=!/run/openbmc/chassis@%i-on`
- `ConditionPathExists=!/run/openbmc/host@%i-on`

This allows the targets to start and the BMC to get in sync with the
state of the chassis and host without any special software checks.
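
The same check can be pictured outside of a unit file. The following is a minimal Python sketch, not code from phosphor-state-manager: it mirrors what `ConditionPathExists=!...` expresses, namely that the real work runs only when the marker file is absent. The names `marker_path` and `should_run` and the `run_dir` parameter are hypothetical and exist only so the logic can be exercised against a temporary directory.

```python
from pathlib import Path


def marker_path(kind: str, instance: int, run_dir: str = "/run/openbmc") -> Path:
    """Build the marker file path, e.g. /run/openbmc/chassis@0-on."""
    return Path(run_dir) / f"{kind}@{instance}-on"


def should_run(kind: str, instance: int, run_dir: str = "/run/openbmc") -> bool:
    """Mirror ConditionPathExists=!<marker>: do the work only if the marker is absent."""
    return not marker_path(kind, instance, run_dir).exists()
```

For instance, once `/run/openbmc/chassis@0-on` exists, `should_run("chassis", 0)` returns `False`. This corresponds to a unit whose condition is not met: systemd skips it without treating it as a failure.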

Different systems have different requirements for what the behavior should be
when the chassis power is on but the host is unreachable. This design needs to
allow this customization. For example, some systems would prefer to just leave
the system in whatever state it is in and let the user correct things. Some
systems want to recover automatically (e.g. reboot the host) for the user.
Some systems have a hybrid approach where, depending on where the host was in
its boot, the BMC may leave it or recover.

## Requirements
- Support both IPMI and PLDM as mechanisms to determine if the host is running.
  - Allow either or both to be enabled
- Support custom behavior when chassis power is on but the BMC is unable to
  communicate with the host
- Both IPMI and PLDM stacks will give the host a set amount of time to
  respond. Lack of response within this time limit will result in the BMC
  potentially taking recovery actions.
  - This time limit must be configurable at build time
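
The response time limit can be pictured as a bounded polling loop. The following is a minimal Python sketch under stated assumptions: `probe` is a hypothetical callable that returns `True` once the host has answered, and `wait_for_host`, `HOST_RESPONSE_TIMEOUT_S`, and the 30 second value are illustrative placeholders rather than names or values from the IPMI or PLDM stacks. The design calls for a build-time setting; a module constant stands in for it here.

```python
import time
from typing import Callable

# Hypothetical stand-in for a build-time option; the value is illustrative only.
HOST_RESPONSE_TIMEOUT_S = 30.0


def wait_for_host(
    probe: Callable[[], bool],
    timeout_s: float = HOST_RESPONSE_TIMEOUT_S,
    poll_interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if probe() succeeds before timeout_s elapses, else False."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            # No response within the limit; the caller may start recovery.
            return False
        sleep(poll_interval_s)
```

The `clock` and `sleep` parameters are injectable so the loop can be tested without real delays.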

### IPMI Detailed Requirements
- IPMI will continue to utilize the SMS_ATN command to indicate to the host that
  a "Get Message Flags Command" is requested. Upon the host issuing that
  command, it will be considered up and running

### PLDM Detailed Requirements
- PLDM will do a PDR exchange with the host to regenerate the BMC PDRs
- Where applicable, PLDM will provide a mechanism to distinguish between
  different host firmware stacks
  - For example, on IBM systems there is a difference between the
    hostboot (host initialization) firmware and Hypervisor firmware stacks.
    Both are host firmware and talk PLDM, but the BMC recovery paths will
    differ based on which is running

## Proposed Design
High Level Flow:
- Check pgood
- Check host via IPMI (enabled via compile option)
- Check host via PLDM (enabled via compile option)
- Check host via any custom mechanisms
- Execute automated recovery of host if desired

IPMI and PLDM software will be started as applicable. A combination of systemd
services and applications within phosphor-state-manager will coordinate the
checking of pgood and, if it is set, request the IPMI and PLDM applications to
discover if the host is running. Based on the response from these queries,
the software in phosphor-state-manager will take the appropriate action:
creating the /run files and starting the chassis and host targets, or entering
into recovery of the host.
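
The decision logic of this flow can be sketched in a few lines. This is a hedged Python illustration, not the phosphor-state-manager implementation: `Action`, `decide_action`, and the idea of passing the enabled host checks as a list of callables are all assumptions made for the example. The `NOTHING` outcome for the pgood-off case is also an assumption, since the text only describes what happens when pgood is set.

```python
from enum import Enum
from typing import Callable, Iterable


class Action(Enum):
    NOTHING = "pgood not set; no host check is performed (assumed)"
    MARK_RUNNING = "create /run files and start chassis and host targets"
    RECOVER_HOST = "apply the system's policy for an unreachable host"


def decide_action(pgood: bool, host_checks: Iterable[Callable[[], bool]]) -> Action:
    """Apply the flow: pgood first, then each enabled host check in order."""
    if not pgood:
        return Action.NOTHING
    # any() stops at the first check that reports the host as running,
    # so later mechanisms (e.g. custom ones) act as fallbacks.
    if any(check() for check in host_checks):
        return Action.MARK_RUNNING
    return Action.RECOVER_HOST
```

A system built with only PLDM enabled would pass a single callable; a system with a custom fallback would append it after the IPMI and PLDM checks.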

The systemd targets responsible for this and any common services will be hosted
within phosphor-state-manager. Any system- or company-specific services can
be installed in the common targets:
- obmc-chassis-powerreset@.target.requires
- obmc-host-reset@.target.requires

### Note on custom mechanism for IBM systems
IBM systems will utilize a processor CFAM register. The specific register is
**Mailbox scratch register 12**.

If the chassis power is on but the BMC is unable to communicate with the
host via IPMI or PLDM, then the BMC will read this processor CFAM register.

The Host code will write `0xA5000001` to this register to indicate when it has
reached a state in which it can boot an operating system without needing the
BMC. If the BMC sees this value written in the CFAM register, then it will leave
the host as-is, even if it is unable to communicate with the host over IPMI or
PLDM. It will log an error indicating it was unable to communicate with the host
but it will also show the host state as `Running`.

If the register is not `0xA5000001`, then the BMC will follow whatever recovery
mechanisms are defined for when the host has a failure (most likely a reboot
of the host).

It is the responsibility of the host firmware to set this register as
applicable during the boot of the system. Host firmware will clear this register
in shutdown scenarios. To handle different host crash scenarios, the register
will also be cleared by the BMC firmware on power-offs, system checkstops, and
during Memory Preserving reboots.
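
The register-based fallback reduces to a single comparison. The sketch below is illustrative Python only: the value `0xA5000001` comes from the text above, while `HOST_STANDALONE_VALUE`, `FallbackDecision`, and `decide_from_scratch_register` are hypothetical names. How the register is actually read from the CFAM is outside this sketch; the function takes the value as an argument.

```python
from enum import Enum

# Value the host writes once it can boot an OS without the BMC (from the text).
HOST_STANDALONE_VALUE = 0xA5000001


class FallbackDecision(Enum):
    LEAVE_RUNNING_AND_LOG = "leave host as-is, log a communication error, show Running"
    RUN_HOST_RECOVERY = "follow the recovery defined for a host failure"


def decide_from_scratch_register(value: int) -> FallbackDecision:
    """Decide what to do when IPMI and PLDM both failed to reach the host."""
    if value == HOST_STANDALONE_VALUE:
        return FallbackDecision.LEAVE_RUNNING_AND_LOG
    # Any other value, including a cleared register, means the host is not
    # known to be self-sufficient.
    return FallbackDecision.RUN_HOST_RECOVERY
```

Because the register is cleared on shutdown, power off, checkstop, and memory preserving reboot, a stale value from an earlier boot should not be mistaken for a running host.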


## Alternatives Considered
One thought was to avoid IPMI/PLDM altogether and only use a "lowest common
denominator" hardware register of some sort. The problem with that is you start
creating your own protocol, and before you know it you have something like IPMI
or PLDM anyway, except you created something custom. The expectation is that the
IPMI or PLDM path will be sufficient in nearly all cases, and as a backup option,
system owners can put their own custom host-detection applications in.

## Impacts
None

## Testing
The normal path of IPMI and PLDM will be simple to test. Boot your system and
reboot the BMC. Verify the BMC chassis and host states are correct and verify
the host continued to run without issue throughout the BMC reset.

The more complicated tests will be error paths:
- Reboot the BMC while the host is booting, but before it's in a state where
  it can continue to run without the BMC. Verify the BMC detects this error
  scenario once it comes back from its reboot and takes the proper recovery
  actions for the host.
- Reboot the BMC when the host is up and running but disable the IPMI/PLDM stack
  on the host so it does not respond to the BMC when it comes back from its
  reboot. Ensure the BMC follows the defined recovery for the system in this
  situation.

[1]: https://github.com/openbmc/phosphor-state-manager#bmc-reset-with-host-andor-chassis-on