# Platform Event Log Message Registry
On the BMC, PELs are created from the standard event logs provided by
phosphor-logging using a message registry that provides the PEL-related fields.
The message registry is a JSON file.

## Contents
* [Component IDs](#component-ids)
* [Message Registry Fields](#message-registry-fields)
* [Modifying and Testing](#modifying-and-testing)

## Component IDs
A component ID is a 2 byte value of the form 0xYY00 used in a PEL to:
1. Provide the upper byte (the YY from above) of an SRC reason code in `BD`
   SRCs.
2. Reside in the section header of the Private Header PEL section to specify
   the error log creator's component ID.
3. Reside in the section header of the User Header section to specify the error
   log committer's component ID.
4. Reside in the section header in the User Data section to specify which
   parser to call to parse that section.

Component IDs are specified in the message registry either as the upper byte of
the SRC reason code field for `BD` SRCs, or in the standalone `ComponentID`
field.

Component IDs will be unique on a per-repository basis for errors unique to
that repository.  When the same errors are created by multiple repositories,
those errors will all share the same component ID.  The master list of
component IDs is [here](ComponentIDs.md).

## Message Registry Fields
The message registry schema is [here](schema/schema.json), and the message
registry itself is [here](message_registry.json).  The registry will be
validated against the schema during a bitbake build, during CI, or eventually
possibly both.

In the message registry, there are fields for specifying:

### Name
This is the key into the message registry, and is the Message property
of the OpenBMC event log that the PEL is being created from.

```
"Name": "xyz.openbmc_project.Power.Fault"
```

### Subsystem
This field is part of the PEL User Header section, and is used to specify
the subsystem pertaining to the error.  It is an enumeration that maps to the
actual PEL value.  If the subsystem isn't known ahead of time, it can be passed
in at the time of PEL creation using the 'PEL\_SUBSYSTEM' AdditionalData field.
In this case, 'Subsystem' isn't required, though 'PossibleSubsystems' is.

```
"Subsystem": "power_supply"
```

### PossibleSubsystems
This field is used by scripts that build documentation from the message
registry to know which subsystems are possible for an error when it can't be
hardcoded using the 'Subsystem' field.  It is mutually exclusive with the
'Subsystem' field.

```
"PossibleSubsystems": ["memory", "processor"]
```

### Severity
This field is part of the PEL User Header section, and is used to specify
the PEL severity.  It is an optional field; if it isn't specified, then the
severity of the OpenBMC event log will be converted into a PEL severity value.

It can either be the plain severity value, or an array of severity values that
are based on system type, where an entry without a system type will match
anything unless another entry has a matching system type.

```
"Severity": "unrecoverable"
```

```
"Severity":
[
    {
        "System": "system1",
        "SevValue": "recovered"
    },
    {
        "SevValue": "unrecoverable"
    }
]
```

The above example shows that on system 'system1' the severity will be
recovered, and on every other system it will be unrecoverable.

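The matching rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as documented, not the BMC's implementation: the function name `select_severity` is hypothetical, and it assumes every array entry carries its value in `SevValue`.

```python
def select_severity(severity, system_type):
    """Resolve a registry 'Severity' value for the given system type.

    Hypothetical helper: it mirrors the documented rule, not the real code.
    """
    # Plain form: a single severity string applies to every system.
    if isinstance(severity, str):
        return severity

    default = None
    for entry in severity:
        if "System" not in entry:
            # Entry without a system type: used only if nothing else matches.
            if default is None:
                default = entry["SevValue"]
        elif entry["System"] == system_type:
            # An entry naming this system always wins.
            return entry["SevValue"]
    return default


severity = [
    {"System": "system1", "SevValue": "recovered"},
    {"SevValue": "unrecoverable"},
]
print(select_severity(severity, "system1"))  # recovered
print(select_severity(severity, "system2"))  # unrecoverable
```
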
### Mfg Severity
This is an optional field and is used to override the Severity field when a
specific manufacturing isolation mode is enabled.  It has the same format as
Severity.

```
"MfgSeverity": "unrecoverable"
```

### Event Scope
This field is part of the PEL User Header section, and is used to specify
the event scope, as defined by the PEL spec.  It is optional and defaults to
"entire platform".

```
"EventScope": "entire_platform"
```

### Event Type
This field is part of the PEL User Header section, and is used to specify
the event type, as defined by the PEL spec.  It is optional and defaults to
"not applicable" for non-informational logs, and "misc_information_only" for
informational ones.

```
"EventType": "na"
```

### Action Flags
This field is part of the PEL User Header section, and is used to specify the
PEL action flags, as defined by the PEL spec.  It is an array of enumerations.

The action flags can usually be deduced from other PEL fields, such as the
severity or if there are any callouts.  As such, this is an optional field and
if not supplied the code will fill them in based on those fields.

In fact, even if supplied here, the code may still modify them to ensure they
are correct.  The rules used for this are
[here](../README.md#action-flags-and-event-type-rules).

```
"ActionFlags": ["service_action", "report", "call_home"]
```

### Mfg Action Flags
This is an optional field and is used to override the Action Flags field when a
specific manufacturing isolation mode is enabled.

```
"MfgActionFlags": ["service_action", "report", "call_home"]
```

### Component ID
This is the component ID of the PEL creator, in the form 0xYY00.  For `BD`
SRCs, this is an optional field and if not present the value will be taken from
the upper byte of the reason code.  If present for `BD` SRCs, then its upper
byte must match the upper byte of the reason code.

```
"ComponentID": "0x5500"
```

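As a rough sketch of the rule above, the following Python derives the component ID from the reason code and checks the two for consistency. The function name `resolve_component_id` is hypothetical, and treating `ComponentID` as mandatory for non-`BD` SRCs is an inference from the text rather than something it states outright.

```python
def resolve_component_id(reason_code, component_id=None, src_type="BD"):
    """Return the creator component ID (0xYY00) for a registry entry.

    Hypothetical helper illustrating the documented rule only.
    """
    # Upper byte of the 2 byte reason code, kept in the 0xYY00 position.
    from_reason = int(reason_code, 16) & 0xFF00

    if component_id is None:
        if src_type != "BD":
            # Assumption: only BD SRCs may omit the ComponentID field.
            raise ValueError("ComponentID is required for non-BD SRCs")
        return from_reason

    comp = int(component_id, 16)
    if src_type == "BD" and comp != from_reason:
        raise ValueError(
            f"ComponentID {comp:#06x} does not match reason code {reason_code}"
        )
    return comp


print(hex(resolve_component_id("0x5544")))            # 0x5500
print(hex(resolve_component_id("0x5544", "0x5500")))  # 0x5500
```
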
### SRC Type
This specifies the type of SRC to create.  The type is the first 2 characters
of the 8 character ASCII string field of the PEL.  The allowed types are `BD`,
for the standard OpenBMC error, and `11`, for power related errors.  It is
optional and if not specified will default to `BD`.

Note: The ASCII string for BD SRCs looks like: `BDBBCCCC`, where:
* BD = SRC type
* BB = PEL subsystem as mentioned above
* CCCC = SRC reason code

For `11` SRCs, it looks like: `1100RRRR`, where RRRR is the SRC reason code.

```
"Type": "11"
```

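The two layouts can be illustrated with a short Python sketch. The helper name `src_ascii_string` is hypothetical, the subsystem byte `0x61` is just a placeholder value for the example, and uppercase hex digits are an assumption about formatting.

```python
def src_ascii_string(src_type, reason_code, subsystem=0):
    """Build the 8 character SRC ASCII string for a BD or 11 SRC.

    Hypothetical helper; 'subsystem' is the numeric PEL subsystem byte.
    """
    reason = int(reason_code, 16)
    if src_type == "BD":
        # BD + subsystem byte (BB) + reason code (CCCC)
        return f"BD{subsystem:02X}{reason:04X}"
    if src_type == "11":
        # 11 + 00 + reason code (RRRR)
        return f"1100{reason:04X}"
    raise ValueError(f"unsupported SRC type: {src_type}")


print(src_ascii_string("BD", "0x5544", subsystem=0x61))  # BD615544
print(src_ascii_string("11", "0x5678"))                  # 11005678
```
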
### SRC Reason Code
This is the 4 character value in the latter half of the SRC ASCII string.  It
is treated as a 2 byte hex value, such as 0x5678.  For `BD` SRCs, the first
byte is the same as the first byte of the component ID field in the Private
Header section that represents the creator's component ID.

```
"ReasonCode": "0x5544"
```

### SRC Symptom ID Fields
The symptom ID is in the Extended User Header section and is defined in the PEL
spec as the unique event signature string.  It always starts with the SRC ASCII
string.  This field in the message registry allows one to choose which SRC words
to use in addition to the ASCII string field to form the symptom ID.  All words
are separated by underscores.  If not specified, the code will choose a default
format, which may depend on the SRC type.

For example: ["SRCWord3", "SRCWord9"] would be:
`<ASCII_STRING>_<SRCWord3>_<SRCWord9>`, which could look like:
`B181320_00000050_49000000`.

```
"SymptomIDFields": ["SRCWord3", "SRCWord9"]
```

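A minimal Python sketch of how such a symptom ID could be assembled follows. The function name `symptom_id` and the dictionary of SRC word values are hypothetical; each word is printed as 8 hex digits to match the example format above.

```python
def symptom_id(ascii_string, src_words, fields):
    """Join the SRC ASCII string and the chosen SRC words with underscores.

    Hypothetical helper; 'src_words' maps names like 'SRCWord3' to integers.
    """
    parts = [ascii_string]
    for name in fields:
        # Each SRC word is a 32 bit value shown as 8 hex digits.
        parts.append(f"{src_words[name]:08X}")
    return "_".join(parts)


words = {"SRCWord3": 0x00000050, "SRCWord9": 0x49000000}
print(symptom_id("BD615544", words, ["SRCWord3", "SRCWord9"]))
# BD615544_00000050_49000000
```
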
### SRC words 6 to 9
In a PEL, these SRC words are free format and can be filled in by the user as
desired.  On the BMC, the source of these words is the AdditionalData fields in
the event log.  The message registry provides a way for the log creator to
specify which AdditionalData property field to get the data from, and also to
define what the SRC word means for use by parsers.  If not specified, these SRC
words will be set to zero in the PEL.

```
"Words6to9":
{
    "6":
    {
        "description": "Failing unit number",
        "AdditionalDataPropSource": "PS_NUM"
    }
}
```

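The mapping from AdditionalData to SRC words might be pictured with the Python sketch below. It is an illustration under assumptions: the helper name `fill_words_6_to_9` is hypothetical, AdditionalData values are assumed to be numeric strings (decimal or `0x` prefixed), and a missing AdditionalData key is assumed to leave the word at zero.

```python
def fill_words_6_to_9(words6to9, additional_data):
    """Populate SRC words 6-9 from AdditionalData per the registry entry.

    Hypothetical helper; returns a dict keyed by word number.
    """
    # Words with no registry entry stay at zero, as the text describes.
    words = {6: 0, 7: 0, 8: 0, 9: 0}
    for number, spec in words6to9.items():
        value = additional_data.get(spec["AdditionalDataPropSource"])
        if value is not None:
            # Assumption: values parse as integers and are kept to 32 bits.
            words[int(number)] = int(value, 0) & 0xFFFFFFFF
    return words


registry_words = {
    "6": {
        "description": "Failing unit number",
        "AdditionalDataPropSource": "PS_NUM",
    }
}
print(fill_words_6_to_9(registry_words, {"PS_NUM": "2"}))
# {6: 2, 7: 0, 8: 0, 9: 0}
```
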
### Documentation Fields
The documentation fields are used by PEL parsers to display a human readable
description of a PEL.  They are also the source for the Redfish event log
messages.

#### Message
This field is used by the BMC's PEL parser as the description of the error log.
It will also be used in Redfish event logs.  It supports argument substitution
using the %1, %2, etc. placeholders, allowing any of the SRC user data words
6 through 9 to be displayed as part of the message.  If the placeholders are
used, then the `MessageArgSources` property must be present to say which SRC
words to use for each placeholder.

```
"Message": "Processor %1 had %2 errors"
```

#### MessageArgSources
This field is required when the Message field contains the %X placeholder
arguments, and is otherwise optional.  It is an array that says which SRC words
to get the placeholder values from.  In the example below, SRC word 6 would be
used for %1, and SRC word 7 for %2.

```
"MessageArgSources":
[
    "SRCWord6", "SRCWord7"
]
```

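The placeholder substitution can be sketched in Python as follows. The function name `format_message` is hypothetical, and showing each SRC word as a hex value is an assumption about the display format, which the text does not specify.

```python
import re


def format_message(message, arg_sources, src_words):
    """Replace %1, %2, ... with the SRC words named in MessageArgSources.

    Hypothetical helper; 'src_words' maps names like 'SRCWord6' to integers.
    """
    def replace(match):
        # %1 refers to the first MessageArgSources entry, %2 to the second.
        index = int(match.group(1)) - 1
        return f"0x{src_words[arg_sources[index]]:X}"

    return re.sub(r"%(\d+)", replace, message)


text = format_message(
    "Processor %1 had %2 errors",
    ["SRCWord6", "SRCWord7"],
    {"SRCWord6": 1, "SRCWord7": 0x20},
)
print(text)  # Processor 0x1 had 0x20 errors
```
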
#### Description
A short description of the error.  This is required by the Redfish schema to
generate a Redfish message entry, but is not used in Redfish or PEL output.

```
"Description": "A power fault"
```

#### Notes
This is an optional free format text field for keeping any notes for the
registry entry, as comments are not allowed in JSON.  It is an array of strings
for easier readability of long fields.

```
"Notes": [
    "This entry is for every type of power fault.",
    "There is probably a hardware failure."
]
```

### Callout Fields
The callout fields allow one to specify the PEL callouts (either a hardware
FRU, a symbolic FRU, or a maintenance procedure) in the entry for a particular
error.  These callouts can vary based on system type, as well as a user
specified AdditionalData property field.  Callouts will be added to the PEL in
the order they are listed in the JSON.  If a callout is passed into the error,
say with CALLOUT_INVENTORY_PATH, then that callout will be added to the PEL
before the callouts in the registry.

There is room for up to 10 callouts in a PEL.

#### Callouts example based on the system type

```
"Callouts":
[
    {
        "System": "system1",
        "CalloutList":
        [
            {
                "Priority": "high",
                "LocCode": "P1-C1"
            },
            {
                "Priority": "low",
                "LocCode": "P1"
            }
        ]
    },
    {
        "CalloutList":
        [
            {
                "Priority": "high",
                "Procedure": "SVCDOCS"
            }
        ]
    }
]
```

The above example shows that on system 'system1', the FRU at location P1-C1
will be called out with a priority of high, and the FRU at P1 with a priority
of low.  On every other system, the maintenance procedure SVCDOCS is called
out.

#### Callouts example based on an AdditionalData field

```
"CalloutsUsingAD":
{
    "ADName": "PROC_NUM",
    "CalloutsWithTheirADValues":
    [
        {
            "ADValue": "0",
            "Callouts":
            [
                {
                    "CalloutList":
                    [
                        {
                            "Priority": "high",
                            "LocCode": "P1-C5"
                        }
                    ]
                }
            ]
        },
        {
            "ADValue": "1",
            "Callouts":
            [
                {
                    "CalloutList":
                    [
                        {
                            "Priority": "high",
                            "LocCode": "P1-C6"
                        }
                    ]
                }
            ]
        }
    ]
}
```

This example shows that the callouts were selected based on the 'PROC_NUM'
AdditionalData field.  When PROC_NUM was 0, the FRU at P1-C5 was called out.
When it was 1, P1-C6 was called out.  Note that the same 'Callouts' array is
used as in the previous example, so these callouts can also depend on the
system type.

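Both selection schemes can be combined in one lookup, sketched here in Python. The function names are hypothetical and the logic only mirrors what the two examples describe: pick the `Callouts` array (directly, or through the AdditionalData value), then pick the `CalloutList` for the system type, falling back to an entry with no `System` key.

```python
def select_callout_list(callouts, system_type):
    """Pick the CalloutList for this system from a 'Callouts' array."""
    default = []
    for entry in callouts:
        if "System" not in entry:
            default = entry["CalloutList"]   # catch-all entry
        elif entry["System"] == system_type:
            return entry["CalloutList"]      # exact system match wins
    return default


def select_callouts(registry_entry, system_type, additional_data):
    """Resolve callouts from 'Callouts' or 'CalloutsUsingAD' (hypothetical)."""
    if "Callouts" in registry_entry:
        return select_callout_list(registry_entry["Callouts"], system_type)

    using_ad = registry_entry.get("CalloutsUsingAD")
    if using_ad is None:
        return []

    # Look up the AdditionalData value named by ADName, e.g. PROC_NUM.
    ad_value = additional_data.get(using_ad["ADName"])
    for candidate in using_ad["CalloutsWithTheirADValues"]:
        if candidate["ADValue"] == ad_value:
            return select_callout_list(candidate["Callouts"], system_type)
    return []


entry = {
    "CalloutsUsingAD": {
        "ADName": "PROC_NUM",
        "CalloutsWithTheirADValues": [
            {"ADValue": "0", "Callouts": [
                {"CalloutList": [{"Priority": "high", "LocCode": "P1-C5"}]}]},
            {"ADValue": "1", "Callouts": [
                {"CalloutList": [{"Priority": "high", "LocCode": "P1-C6"}]}]},
        ],
    }
}
print(select_callouts(entry, "system1", {"PROC_NUM": "1"}))
# [{'Priority': 'high', 'LocCode': 'P1-C6'}]
```
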
#### CalloutType
This field can be used to modify the failing component type field in the
callout when the default doesn't fit:

```
{
    "Priority": "high",
    "Procedure": "FIXIT22",
    "CalloutType": "config_procedure"
}
```

The defaults are:
- Normal hardware FRU: hardware_fru
- Symbolic FRU: symbolic_fru
- Procedure: maint_procedure

#### Symbolic FRU callouts with dynamic trusted location codes

A special case is when one wants to use a symbolic FRU callout with a trusted
location code, but the location code to use isn't known until runtime.  This
means it can't be specified using the 'LocCode' key in the registry.

In this case, one should use the 'SymbolicFRUTrusted' key along with the
'UseInventoryLocCode' key, and then pass in the inventory item that has the
desired location code using the 'CALLOUT_INVENTORY_PATH' entry inside of the
AdditionalData property.  The code will then look up the location code for that
passed in inventory FRU and place it in the symbolic FRU callout.  The normal
FRU callout with that inventory item will not be created.  The symbolic FRU
must be the first callout in the registry for this to work.

```
{
    "Priority": "high",
    "SymbolicFRUTrusted": "AIR_MOVR",
    "UseInventoryLocCode": true
}
```

## Modifying and Testing

The general process for adding new entries to the message registry is:

1. Update message_registry.json to add the new errors.
2. If a new component ID is used (usually the first byte of the SRC reason
   code), document it in ComponentIDs.md.
3. Validate the file. It must be valid JSON and obey the schema.  The
   `process_registry.py` script in `extensions/openpower-pels/registry/tools`
   will validate both, though it requires the python-jsonschema package to do
   the schema validation.  This script is also run to validate the message
   registry as part of CI testing.

   ```
   ./tools/process_registry.py -v -s schema/schema.json -r message_registry.json
   ```

4. One can test what PELs are generated from these new entries without writing
   any code to create the corresponding event logs:
    1. Copy the modified message_registry.json into `/etc/phosphor-logging/` on
       the BMC. That directory may need to be created.
    2. Use busctl to call the Create method to create an event log
       corresponding to the message registry entry under test.

       ```
       busctl call xyz.openbmc_project.Logging /xyz/openbmc_project/logging \
       xyz.openbmc_project.Logging.Create Create ssa{ss} \
       xyz.openbmc_project.Common.Error.Timeout \
       xyz.openbmc_project.Logging.Entry.Level.Error 1 "TIMEOUT_IN_MSEC" "5"
       ```

    3. Check the PEL that was created using peltool.
    4. When finished, delete the file from `/etc/phosphor-logging/`.

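The `ssa{ss}` signature in the busctl call above takes the message, the severity, and then the AdditionalData dictionary written as an entry count followed by key/value pairs. When testing several registry entries, the argument list could be generated with a small Python sketch like the one below; the helper name `build_create_command` is hypothetical, and the service, path, and interface names are copied from the command shown above.

```python
import shlex


def build_create_command(message, severity, additional_data):
    """Build the busctl argv for Logging.Create (hypothetical helper)."""
    argv = [
        "busctl", "call",
        "xyz.openbmc_project.Logging",
        "/xyz/openbmc_project/logging",
        "xyz.openbmc_project.Logging.Create",
        "Create",
        "ssa{ss}",
        message,
        severity,
        # a{ss}: the number of entries comes first, then each key and value.
        str(len(additional_data)),
    ]
    for key, value in additional_data.items():
        argv += [key, value]
    return argv


cmd = build_create_command(
    "xyz.openbmc_project.Common.Error.Timeout",
    "xyz.openbmc_project.Logging.Entry.Level.Error",
    {"TIMEOUT_IN_MSEC": "5"},
)
# shlex.join quotes arguments so the line can be pasted into a shell.
print(shlex.join(cmd))
```

The list form can also be passed directly to `subprocess.run` on a system where busctl is available, which avoids shell quoting altogether.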