# Platform Event Log Message Registry
On the BMC, PELs are created from the standard event logs provided by
phosphor-logging using a message registry that provides the PEL related fields.
The message registry is a JSON file.

## Contents
* [Component IDs](#component-ids)
* [Message Registry](#message-registry-fields)
* [Modifying and Testing](#modifying-and-testing)

## Component IDs
A component ID is a 2 byte value of the form 0xYY00 used in a PEL to:
1. Provide the upper byte (the YY from above) of an SRC reason code in `BD`
   SRCs.
2. Reside in the section header of the Private Header PEL section to specify
   the error log creator's component ID.
3. Reside in the section header of the User Header section to specify the error
   log committer's component ID.
4. Reside in the section header in the User Data section to specify which
   parser to call to parse that section.

Component IDs are specified in the message registry either as the upper byte of
the SRC reason code field for `BD` SRCs, or in the standalone `ComponentID`
field.

Component IDs will be unique on a per-repository basis for errors unique to
that repository. When the same errors are created by multiple repositories,
those errors will all share the same component ID. The master list of
component IDs is [here](ComponentIDs.md).

## Message Registry Fields
The message registry schema is [here](schema/schema.json), and the message
registry itself is [here](message_registry.json). The schema will be validated
either during a bitbake build or during CI, or eventually possibly both.

In the message registry, there are fields for specifying:

### Name
This is the key into the message registry, and is the Message property
of the OpenBMC event log that the PEL is being created from.

```
"Name": "xyz.openbmc_project.Power.Fault"
```

### Subsystem
This field is part of the PEL User Header section, and is used to specify
the subsystem pertaining to the error. It is an enumeration that maps to the
actual PEL value.

```
"Subsystem": "power_supply"
```

### Severity
This field is part of the PEL User Header section, and is used to specify
the PEL severity. It is an optional field; if it isn't specified, the
severity of the OpenBMC event log will be converted into a PEL severity value.

It can either be the plain severity value, or an array of severity values that
are based on system type, where an entry without a system type will match
anything unless another entry has a matching system type.

```
"Severity": "unrecoverable"
```

```
"Severity":
[
    {
        "System": "system1",
        "SevValue": "recovered"
    },
    {
        "SevValue": "unrecoverable"
    }
]
```
The above example shows that on system 'system1' the severity will be
recovered, and on every other system it will be unrecoverable.

### Mfg Severity
This is an optional field and is used to override the Severity field when a
specific manufacturing isolation mode is enabled. It has the same format as
Severity.

```
"MfgSeverity": "unrecoverable"
```

### Event Scope
This field is part of the PEL User Header section, and is used to specify
the event scope, as defined by the PEL spec. It is optional and defaults to
"entire platform".

```
"EventScope": "entire_platform"
```

### Event Type
This field is part of the PEL User Header section, and is used to specify
the event type, as defined by the PEL spec. It is optional and defaults to
"not applicable" for non-informational logs, and "misc_information_only" for
informational ones.

```
"EventType": "na"
```

### Action Flags
This field is part of the PEL User Header section, and is used to specify the
PEL action flags, as defined by the PEL spec. It is an array of enumerations.

The action flags can usually be deduced from other PEL fields, such as the
severity or whether there are any callouts. As such, this is an optional field,
and if not supplied the code will fill them in based on those fields.

In fact, even if supplied here, the code may still modify them to ensure they
are correct. The rules used for this are
[here](../README.md#action-flags-and-event-type-rules).

```
"ActionFlags": ["service_action", "report", "call_home"]
```

### Mfg Action Flags
This is an optional field and is used to override the Action Flags field when a
specific manufacturing isolation mode is enabled.

```
"MfgActionFlags": ["service_action", "report", "call_home"]
```

### Component ID
This is the component ID of the PEL creator, in the form 0xYY00. For `BD`
SRCs, this is an optional field and if not present the value will be taken from
the upper byte of the reason code. If present for `BD` SRCs, then this byte
must match the upper byte of the reason code.

```
"ComponentID": "0x5500"
```

### SRC Type
This specifies the type of SRC to create. The type is the first 2 characters
of the 8 character ASCII string field of the PEL. The allowed types are `BD`,
for the standard OpenBMC error, and `11`, for power related errors. It is
optional and if not specified will default to `BD`.

Note: The ASCII string for BD SRCs looks like: `BDBBCCCC`, where:
* BD = SRC type
* BB = PEL subsystem as mentioned above
* CCCC = SRC reason code

For `11` SRCs, it looks like: `1100RRRR`, where RRRR is the SRC reason code.
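
The two ASCII string layouts above, and the rule that a `BD` SRC's component ID comes from the upper byte of the reason code, can be sketched in a few lines of Python. This is only an illustration of the layout, not the code phosphor-logging uses. The function names are hypothetical, and the subsystem byte `0x61` is a made-up value, not one taken from the PEL spec.

```python
def src_ascii_string(src_type, reason_code, subsystem=0):
    """Build the 8 character SRC ASCII string for the layouts described above.

    `subsystem` is the numeric PEL subsystem byte; it is only used for BD SRCs.
    """
    if src_type == "BD":
        return f"BD{subsystem:02X}{reason_code:04X}"
    if src_type == "11":
        return f"1100{reason_code:04X}"
    raise ValueError(f"unsupported SRC type: {src_type}")


def component_id_from_reason_code(reason_code):
    """Return the 0xYY00 component ID implied by a BD SRC reason code."""
    return reason_code & 0xFF00


# 0x61 is a hypothetical subsystem byte used only for illustration.
print(src_ascii_string("BD", 0x5544, subsystem=0x61))   # BD615544
print(src_ascii_string("11", 0x5544))                   # 11005544
print(hex(component_id_from_reason_code(0x5544)))       # 0x5500
```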

```
"Type": "11"
```

### SRC Reason Code
This is the 4 character value in the latter half of the SRC ASCII string. It
is treated as a 2 byte hex value, such as 0x5678. For `BD` SRCs, the first
byte is the same as the first byte of the component ID field in the Private
Header section that represents the creator's component ID.

```
"ReasonCode": "0x5544"
```

### SRC Symptom ID Fields
The symptom ID is in the Extended User Header section and is defined in the PEL
spec as the unique event signature string. It always starts with the ASCII
string. This field in the message registry allows one to choose which SRC words
to use in addition to the ASCII string field to form the symptom ID. All words
are separated by underscores. If not specified, the code will choose a default
format, which may depend on the SRC type.

For example: ["SRCWord3", "SRCWord9"] would be:
`<ASCII_STRING>_<SRCWord3>_<SRCWord9>`, which could look like:
`B181320_00000050_49000000`.

```
"SymptomIDFields": ["SRCWord3", "SRCWord9"]
```

### SRC words 6 to 9
In a PEL, these SRC words are free format and can be filled in by the user as
desired. On the BMC, the source of these words is the AdditionalData fields in
the event log. The message registry provides a way for the log creator to
specify which AdditionalData property field to get the data from, and also to
define what the SRC word means for use by parsers. If not specified, these SRC
words will be set to zero in the PEL.

```
"Words6to9":
{
    "6":
    {
        "description": "Failing unit number",
        "AdditionalDataPropSource": "PS_NUM"
    }
}
```

### SRC Power Fault flag
The SRC has a bit in it to indicate if the error is a power fault. This is an
optional field in the message registry and defaults to false.

```
"PowerFault": false
```

### Documentation Fields
The documentation fields are used by PEL parsers to display a human readable
description of a PEL. They are also the source for the Redfish event log
messages.

#### Message
This field is used by the BMC's PEL parser as the description of the error log.
It will also be used in Redfish event logs. It supports argument substitution
using the %1, %2, etc. placeholders, allowing any of the SRC user data words
6 - 9 to be displayed as part of the message. If the placeholders are used, then
the `MessageArgSources` property must be present to say which SRC words to use
for each placeholder.

```
"Message": "Processor %1 had %2 errors"
```

#### MessageArgSources
This field is optional in general, but required when the Message field contains
the %X placeholder arguments. It is an array that says which SRC words to get
the placeholders from. In the example below, SRC word 6 would be used for %1,
and SRC word 7 for %2.

```
"MessageArgSources":
[
    "SRCWord6", "SRCWord7"
]
```

#### Description
A short description of the error. This is required by the Redfish schema to
generate a Redfish message entry, but is not used in Redfish or PEL output.

```
"Description": "A power fault"
```

#### Notes
This is an optional free format text field for keeping any notes for the
registry entry, as comments are not allowed in JSON. It is an array of strings
for easier readability of long fields.

```
"Notes": [
    "This entry is for every type of power fault.",
    "There is probably a hardware failure."
]
```

### Callout Fields
The callout fields allow one to specify the PEL callouts (either a hardware
FRU, a symbolic FRU, or a maintenance procedure) in the entry for a particular
error.
These callouts can vary based on system type, as well as a user
specified AdditionalData property field. Callouts will be added to the PEL in
the order they are listed in the JSON. If a callout is passed into the error,
say with CALLOUT_INVENTORY_PATH, then that callout will be added to the PEL
before the callouts in the registry.

There is room for up to 10 callouts in a PEL.

#### Callouts example based on the system type

```
"Callouts":
[
    {
        "System": "system1",
        "CalloutList":
        [
            {
                "Priority": "high",
                "LocCode": "P1-C1"
            },
            {
                "Priority": "low",
                "LocCode": "P1"
            }
        ]
    },
    {
        "CalloutList":
        [
            {
                "Priority": "high",
                "Procedure": "SVCDOCS"
            }
        ]
    }
]
```

The above example shows that on system 'system1', the FRU at location P1-C1
will be called out with a priority of high, and the FRU at P1 with a priority
of low. On every other system, the maintenance procedure SVCDOCS is called
out.

#### Callouts example based on an AdditionalData field

```
"CalloutsUsingAD":
{
    "ADName": "PROC_NUM",
    "CalloutsWithTheirADValues":
    [
        {
            "ADValue": "0",
            "Callouts":
            [
                {
                    "CalloutList":
                    [
                        {
                            "Priority": "high",
                            "LocCode": "P1-C5"
                        }
                    ]
                }
            ]
        },
        {
            "ADValue": "1",
            "Callouts":
            [
                {
                    "CalloutList":
                    [
                        {
                            "Priority": "high",
                            "LocCode": "P1-C6"
                        }
                    ]
                }
            ]
        }
    ]
}
```

This example shows that the callouts were selected based on the 'PROC_NUM'
AdditionalData field. When PROC_NUM was 0, the FRU at P1-C5 was called out.
When it was 1, P1-C6 was called out. Note that the same 'Callouts' array is
used as in the previous example, so these callouts can also depend on the
system type.
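
The selection rules in the two callout examples can be summarized with a small Python sketch. It is not the logic phosphor-logging actually runs, only one reading of the behavior described above. The function names are hypothetical, and returning an empty list when no `ADValue` matches is an assumption.

```python
def select_callout_list(callouts, system):
    """Return the CalloutList whose "System" matches, else the entry with no "System"."""
    fallback = []
    for entry in callouts:
        if "System" not in entry:
            fallback = entry["CalloutList"]
        elif entry["System"] == system:
            return entry["CalloutList"]
    return fallback


def resolve_callouts(registry_entry, system, additional_data):
    """Pick the callouts for one registry entry (a sketch, not the real implementation)."""
    if "CalloutsUsingAD" in registry_entry:
        using_ad = registry_entry["CalloutsUsingAD"]
        ad_value = additional_data.get(using_ad["ADName"])
        for candidate in using_ad["CalloutsWithTheirADValues"]:
            if candidate["ADValue"] == ad_value:
                return select_callout_list(candidate["Callouts"], system)
        return []  # assumption: no matching ADValue means no registry callouts
    return select_callout_list(registry_entry.get("Callouts", []), system)


by_system = {"Callouts": [
    {"System": "system1", "CalloutList": [{"Priority": "high", "LocCode": "P1-C1"}]},
    {"CalloutList": [{"Priority": "high", "Procedure": "SVCDOCS"}]},
]}
by_ad = {"CalloutsUsingAD": {"ADName": "PROC_NUM", "CalloutsWithTheirADValues": [
    {"ADValue": "0", "Callouts": [{"CalloutList": [{"Priority": "high", "LocCode": "P1-C5"}]}]},
    {"ADValue": "1", "Callouts": [{"CalloutList": [{"Priority": "high", "LocCode": "P1-C6"}]}]},
]}}

print(resolve_callouts(by_system, "system2", {}))
# [{'Priority': 'high', 'Procedure': 'SVCDOCS'}]
print(resolve_callouts(by_ad, "system1", {"PROC_NUM": "1"}))
# [{'Priority': 'high', 'LocCode': 'P1-C6'}]
```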

#### CalloutType
This field can be used to modify the failing component type field in the
callout when the default doesn't fit:

```
{
    "Priority": "high",
    "Procedure": "FIXIT22",
    "CalloutType": "config_procedure"
}
```

The defaults are:
- Normal hardware FRU: hardware_fru
- Symbolic FRU: symbolic_fru
- Procedure: maint_procedure

#### Symbolic FRU callouts with dynamic trusted location codes

A special case is when one wants to use a symbolic FRU callout with a trusted
location code, but the location code to use isn't known until runtime. This
means it can't be specified using the 'LocCode' key in the registry.

In this case, one should use the 'SymbolicFRUTrusted' key along with the
'UseInventoryLocCode' key, and then pass in the inventory item that has the
desired location code using the 'CALLOUT_INVENTORY_PATH' entry inside of the
AdditionalData property. The code will then look up the location code for that
passed in inventory FRU and place it in the symbolic FRU callout. The normal
FRU callout with that inventory item will not be created. The symbolic FRU
must be the first callout in the registry for this to work.

```
{
    "Priority": "high",
    "SymbolicFRUTrusted": "AIR_MOVR",
    "UseInventoryLocCode": true
}
```

## Modifying and Testing

The general process for adding new entries to the message registry is:

1. Update message_registry.json to add the new errors.
2. If a new component ID is used (usually the first byte of the SRC reason
   code), document it in ComponentIDs.md.
3. Validate the file. It must be valid JSON and obey the schema. The
   `process_registry.py` script in `extensions/openpower-pels/registry/tools`
   will validate both, though it requires the python-jsonschema package to do
   the schema validation.
   This script is also run to validate the message
   registry as part of CI testing.

   ```
   ./tools/process_registry.py -v -s schema/schema.json -r message_registry.json
   ```

4. One can test what PELs are generated from these new entries without writing
   any code to create the corresponding event logs:
   1. Copy the modified message_registry.json into `/etc/phosphor-logging/` on
      the BMC. That directory may need to be created.
   2. Use busctl to call the Create method to create an event log
      corresponding to the message registry entry under test.

      ```
      busctl call xyz.openbmc_project.Logging /xyz/openbmc_project/logging \
      xyz.openbmc_project.Logging.Create Create ssa{ss} \
      xyz.openbmc_project.Common.Error.Timeout \
      xyz.openbmc_project.Logging.Entry.Level.Error 1 "TIMEOUT_IN_MSEC" "5"
      ```

   3. Check the PEL that was created using peltool.
   4. When finished, delete the file from `/etc/phosphor-logging/`.
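
When testing several registry entries, it can help to generate the busctl arguments instead of typing them by hand. The Python sketch below builds the same argument list as the command in step 4, including the `a{ss}` entry count that precedes the key/value pairs. The helper name `build_create_command` is hypothetical, and the sketch only builds and prints the command; it does not run it.

```python
import shlex


def build_create_command(message, level, additional_data):
    """Return the busctl argv for Logging.Create, following the example in step 4."""
    args = [
        "busctl", "call",
        "xyz.openbmc_project.Logging",
        "/xyz/openbmc_project/logging",
        "xyz.openbmc_project.Logging.Create",
        "Create", "ssa{ss}",
        message,
        f"xyz.openbmc_project.Logging.Entry.Level.{level}",
        # An a{ss} argument is passed as the entry count followed by key/value pairs.
        str(len(additional_data)),
    ]
    for key, value in additional_data.items():
        args += [key, str(value)]
    return args


cmd = build_create_command(
    "xyz.openbmc_project.Common.Error.Timeout",
    "Error",
    {"TIMEOUT_IN_MSEC": 5},
)
# shlex.join quotes arguments as needed so the line can be pasted into a shell.
print(shlex.join(cmd))
```

The returned list can also be handed to `subprocess.run` on the BMC, which avoids shell quoting altogether.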