# Attention Handler

## Introduction

An attention is a hardware, firmware or software alert mechanism used to request
service from an attention handler via an attention signal (e.g. a GPIO). An
attention handler is, in this case, a stateless service that handles attention
requests. The attention handler, in combination with a hardware analyzer,
constitutes the OpenPOWER Hardware Diagnostics (hardware diags) component.
Hardware diags can operate in daemon mode and in application mode. When
operating in daemon mode, hardware diags may be referred to as the attention
handler or attention handler service. When operating in application mode,
hardware diags may be referred to as the analyzer. Both modes can execute
simultaneously. However, there can be at most one instance of the attention
handler service per attention signal.

## Overview

The attention handler is a long-running process whose main role is to monitor
the attention interrupt signal and delegate tasks to external components to aid
in handling attention requests. The attention handler is loaded into memory when
the host is started and monitors the attention signal until the host is stopped.
When the attention handler detects an attention signal, it queries the state of
the host to determine the nature of the active attention(s).

The attention handler services four types of attentions. In order of priority,
these are Vital Attention (vital), Terminate Immediately (TI), Breakpoint
Attention (BP) and Checkstop Attention (checkstop). TI and BP attentions are
collectively called Special Attentions (special) and are mutually exclusive.
Additionally, a TI can be raised by either hostboot (HBTI) or the hypervisor
(PHYPTI).

The general handling of attentions is as follows:

- vital: log an event, request an SBE dump and request a re-IPL of the host.
- PHYPTI: log an event and request a memory preserving IPL (MPIPL) of the host.
- HBTI: log an event, request a hardware dump and request a re-IPL of the host.
- BP: notify the debug agent (e.g. Cronus).
- checkstop: log an event, call the analyzer, request a system dump and request
  a re-IPL of the host.
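
The handling list above can be read as a lookup from attention type to an
ordered sequence of actions. The following is a minimal sketch of that idea, not
the attention handler's actual code; the step names are hypothetical labels
invented for this example.

```python
# Ordered actions per attention type, following the list above.
# The step names are hypothetical labels, not real function names.
HANDLING_STEPS = {
    "vital": ("log_event", "request_sbe_dump", "request_re_ipl"),
    "PHYPTI": ("log_event", "request_mpipl"),
    "HBTI": ("log_event", "request_hardware_dump", "request_re_ipl"),
    "BP": ("notify_debug_agent",),
    "checkstop": (
        "log_event",
        "call_analyzer",
        "request_system_dump",
        "request_re_ipl",
    ),
}


def handling_steps(kind):
    """Return the ordered steps for one attention type."""
    try:
        return HANDLING_STEPS[kind]
    except KeyError:
        raise ValueError(f"unknown attention type: {kind!r}") from None
```

For example, `handling_steps("BP")` yields only the debug agent notification,
since a breakpoint requests neither a dump nor a re-IPL.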

## Implementation Details

### External Components

The attention handler relies on several external components and BMC services for
the handling of attentions. Among these are systemd, dbus, phosphor-logging and
PDBG.

The general use of these components is as follows:

- systemd: loading and unloading of the attention handler service.
- dbus: querying the state of the BMC and host, requesting dumps and requesting
  host IPLs.
- phosphor-logging: debug tracing and logging platform events.
- PDBG: querying host hardware.

### Component Usage

- Platform Event Log (PEL) entry creation and debug tracing: phosphor-logging
  component.
- Host Transition (re-IPL, MPIPL request): dbus interface.
- Attention handler starting and stopping: systemd.

### Starting and Stopping

The attention handler service is started via the BMC systemd infrastructure when
the host is started. When the attention handler is started it will register
itself as the GPIO event handler associated with the attention GPIO and remain
resident in memory waiting for attentions. The attention handler will be stopped
via the BMC systemd infrastructure when the host is stopped.

If the attention handler terminates due to an error condition it will
automatically be restarted by systemd. The attention handler restart policy is
based on the BMC default restart policy (at most a set number of restarts within
a set number of seconds).
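
A restart policy of this shape can be pictured as a sliding window over recent
restart times. systemd expresses such limits through unit settings such as
`StartLimitBurst=` and `StartLimitIntervalSec=`; the sketch below only
illustrates the idea and is not how systemd implements it. The class name and
the values used are assumptions made for this example, not the BMC defaults.

```python
from collections import deque


class RestartLimiter:
    """Allow at most `burst` restarts within any `interval` seconds."""

    def __init__(self, burst, interval):
        self.burst = burst
        self.interval = interval
        self._starts = deque()  # timestamps of recent restarts

    def allow_restart(self, now):
        # Forget restarts that have aged out of the window.
        while self._starts and now - self._starts[0] >= self.interval:
            self._starts.popleft()
        if len(self._starts) >= self.burst:
            return False  # limit reached, the service stays down
        self._starts.append(now)
        return True
```

With `RestartLimiter(burst=2, interval=10.0)`, a third restart attempt inside
the same ten seconds is refused, and restarts are allowed again once the oldest
one leaves the window.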

### Attentions

When the attention signal becomes active the attention handler will begin
handling attentions. Using the PDBG interface the attention handler will query
the host to determine the reason for the active attention signal. Each enabled
processor will be queried and a map of active attentions will be created. The
first attention of highest priority will be the one that is serviced (see
Overview for attention type and priority). The Global Interrupt Status Register
in combination with an associated attention mask register is used to determine
active attentions.
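
The selection described above can be sketched as follows. This is an
illustration under stated assumptions, not the actual implementation: the bit
positions are invented, a set mask bit is assumed to mean "enabled", and TI and
BP are represented by the single special attention flag they share in the
status register.

```python
from enum import IntEnum


class Attention(IntEnum):
    """Register-level attention flags; a lower value is a higher priority."""

    VITAL = 0
    SPECIAL = 1  # TI or BP, resolved later from the TI info
    CHECKSTOP = 2


# Hypothetical bit positions, chosen only for this example.
FLAG_BITS = {
    Attention.VITAL: 0x4,
    Attention.SPECIAL: 0x2,
    Attention.CHECKSTOP: 0x1,
}


def active_attentions(status, mask):
    """Decode one processor's status register (set mask bit = enabled)."""
    enabled = status & mask
    return [kind for kind, bit in FLAG_BITS.items() if enabled & bit]


def select_attention(processors):
    """Return (processor, attention) to service, or None if none is active.

    `processors` holds (name, status, mask) tuples in query order.
    """
    found = []
    for name, status, mask in processors:
        for kind in active_attentions(status, mask):
            found.append((kind, name))
    if not found:
        return None
    # min() keeps the first minimal entry, so the earliest processor wins ties.
    kind, name = min(found, key=lambda item: item[0])
    return name, kind
```

Given a checkstop on the first processor and special attentions on the next
two, the sketch selects the special attention on the second processor: it
outranks the checkstop and was found first.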

#### Vital Attention

A vital attention indicates that there was a problem with the Self Boot Engine
(SBE) hardware. It is handled by generating an event log via the
phosphor-logging interface and then using the dump manager dbus interface to
request a hardware dump. The attention handler will wait up to one hour for the
dump to complete and then use the systemd interface to request a re-IPL of the
host.
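
The bounded wait can be sketched as a loop with a deadline. This sketch assumes
polling through a caller-supplied `is_complete` callable, which is a
simplification made for illustration; the real service may instead wait on a
dbus signal. The function name and parameters are hypothetical.

```python
import time


def wait_for_dump(
    is_complete,
    timeout_s=3600.0,  # one hour, as described above
    poll_s=1.0,
    clock=time.monotonic,
    sleep=time.sleep,
):
    """Poll until is_complete() is true or timeout_s has elapsed.

    Returns True if the dump completed and False on timeout. `clock` and
    `sleep` are injectable so the loop can be exercised without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if is_complete():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(poll_s, remaining))
```

In either outcome the handler would then go on to request the re-IPL, so the
return value is mainly useful for logging.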

#### Special Attentions

Three types of attentions, HBTI, PHYPTI and BP, share a single attention status
flag (special attention) in the global interrupt status register. In order to
determine the specific type of special attention the attention handler relies on
a portion of shared memory called the Shared TI Information Data Area (TI info).
The attention handler gains access to this memory region by submitting a request
for its memory address through the PDBG interface. The request is via a Chip
Operation (chipop) command.

##### HBTI

These attentions are indications from hostboot that an error has occurred or a
shutdown has been requested. When servicing an HBTI the attention handler will
consider two cases: a TI with SRC and a TI with EID.

For a TI with SRC, using the TI info, the attention handler will create and
submit a PEL entry on behalf of hostboot, request a hostboot dump and, upon
completion or timeout, request a re-IPL of the host using the dbus interface.
Optionally, based upon flags in the TI info, the attention handler may mark the
PEL entry as hidden.

For a TI with EID, the attention handler will, based upon flags in the TI info,
request a hostboot dump using the dbus interface. Upon completion or timeout of
the dump, the attention handler will request a re-IPL of the host via the dbus
interface. For a TI with EID the attention handler will forgo creating a PEL
entry (hostboot has already done this).
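
The two HBTI cases can be summarized as a small decision function. The sketch
below is illustrative only; the `TiInfo` fields and the step names are
hypothetical and do not reflect the real TI info layout.

```python
from dataclasses import dataclass


@dataclass
class TiInfo:
    """Hypothetical subset of the TI info used by this sketch."""

    has_eid: bool  # True for a TI with EID, False for a TI with SRC
    hide_log: bool = False  # flag asking for the PEL entry to be hidden
    request_dump: bool = True  # flag asking for a hostboot dump (EID case)


def hbti_steps(ti):
    """Return the ordered steps for servicing an HBTI."""
    steps = []
    if ti.has_eid:
        # hostboot already logged the error, so no PEL is created on its
        # behalf; the dump depends on a flag in the TI info.
        if ti.request_dump:
            steps.append("request_hostboot_dump")
    else:
        # TI with SRC: log on behalf of hostboot, optionally hidden.
        steps.append("create_hidden_pel" if ti.hide_log else "create_pel")
        steps.append("request_hostboot_dump")
    # Both cases end with a re-IPL once the dump completes or times out.
    steps.append("request_re_ipl")
    return steps
```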

##### PHYPTI

These attentions are indications from the hypervisor that an error has occurred.
When servicing a PHYPTI the attention handler will generate a PEL entry and then
request a host MPIPL using the PDBG interface (chipop).

##### BP (breakpoint)

These attentions are used to signal to the attention handler that it should
notify the debug agent (e.g. Cronus) that a debug breakpoint has been
encountered. If the TI info data is considered valid and neither an HBTI nor a
PHYPTI is determined to be the cause of the special attention then a BP
attention is assumed. When a BP attention occurs the attention handler will use
the dbus interface to notify the debug agent that a breakpoint condition was
encountered. The attention handler will then go back to listening for
attentions.
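
The way a special attention is resolved into HBTI, PHYPTI or BP can be sketched
as below. The `ti_source` argument stands in for whatever the handler decodes
from the TI info and is hypothetical. Falling back to a configured default when
the TI info is not valid is an assumption made for this example, based on the
"special attention default" option listed under Configuration.

```python
def classify_special(ti_valid, ti_source=None, default="BP"):
    """Resolve a special attention to 'HBTI', 'PHYPTI', 'BP' or 'TI'.

    ti_valid:  whether the TI info data is considered valid.
    ti_source: 'HBTI', 'PHYPTI' or None, as decoded from the TI info.
    default:   'BP' or 'TI', used only when the TI info is not valid.
    """
    if not ti_valid:
        # Assumption: without usable TI info, use the configured default.
        return default
    if ti_source in ("HBTI", "PHYPTI"):
        return ti_source
    # Valid TI info that names neither TI source means a breakpoint.
    return "BP"
```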

##### Recoverables

These attentions are indications that recoverable errors have been encountered.
These attentions do not generate an attention GPIO event. However, when
servicing a TI, the attention handler will note whether the host is reporting
recoverable errors. If recoverable errors are present the attention handler will
call the analyzer before requesting a dump or IPL. The check for recoverable
errors is done using the PDBG interface (the Global Interrupt Status Register).

#### Checkstop

These attentions indicate that a hardware error has occurred and further
hardware analysis is required to determine the root cause. When servicing a
checkstop attention the attention handler will call the analyzer and then wait
for the analysis to complete. Once the analysis has completed the attention
handler will use the dbus interface to request a system dump and, upon
completion or timeout, will request a re-IPL of the host.

### Configuration

Some attention handler behavior can be configured by passing parameters to the
service using the service file or using the command line. Currently the
following behaviors are configurable:

- vital handling enable/disable, default enable
- TI handling enable/disable, default enable
- BP handling enable/disable, default enable
- checkstop handling enable/disable, default enable
- special attention default BP/TI, default BP
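
As an illustration of how these options and their defaults fit together, the
sketch below parses them from a command line. The flag names, the program name
and the `on`/`off` values are hypothetical and are not the service's actual
command line interface.

```python
import argparse

# Handlers that can be switched on or off; all are enabled by default.
HANDLERS = ("vital", "ti", "bp", "checkstop")


def build_parser():
    parser = argparse.ArgumentParser(prog="attn-handler-sketch")
    for name in HANDLERS:
        parser.add_argument(
            f"--{name}",
            choices=("on", "off"),
            default="on",
            help=f"enable or disable {name} handling",
        )
    parser.add_argument(
        "--special-default",
        choices=("bp", "ti"),
        default="bp",
        help="default special attention type",
    )
    return parser


def parse_config(argv):
    """Turn command line arguments into a plain configuration dict."""
    args = build_parser().parse_args(argv)
    config = {name: getattr(args, name) == "on" for name in HANDLERS}
    config["special_default"] = args.special_default
    return config
```

Calling `parse_config([])` returns every handler enabled with a special
attention default of `bp`, matching the defaults listed above.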

### Additional Considerations

Regardless of the type of special attention, the attention handler will always
create at least one PEL entry containing attention handler specific FFDC. In
some cases this entry may be informational.

Regardless of the type of attention, after servicing the attention the attention
handler will return to listening for attentions. In most cases no more
attentions will be detected unless the currently active attentions are cleared.
Normally active attentions will be cleared by the host re-IPL, but any component
could clear the active attentions, as is the case for BP attentions and the
debug agent.

If the attention handler encounters any errors while servicing an attention it
will generate a PEL entry to indicate the error that was encountered.