# Platform Event Log Message Registry
On the BMC, PELs are created from the standard event logs provided by
phosphor-logging using a message registry that provides the PEL related fields.
The message registry is a JSON file.

## Contents
* [Component IDs](#component-ids)
* [Message Registry](#message-registry-fields)
* [Modifying and Testing](#modifying-and-testing)

## Component IDs
A component ID is a 2 byte value of the form 0xYY00 used in a PEL to:
1. Provide the upper byte (the YY from above) of an SRC reason code in `BD`
   SRCs.
2. Reside in the section header of the Private Header PEL section to specify
   the error log creator's component ID.
3. Reside in the section header of the User Header section to specify the error
   log committer's component ID.
4. Reside in the section header in the User Data section to specify which
   parser to call to parse that section.

Component IDs are specified in the message registry either as the upper byte of
the SRC reason code field for `BD` SRCs, or in the standalone `ComponentID`
field.

Component IDs will be unique on a per-repository basis for errors unique to
that repository. When the same errors are created by multiple repositories,
those errors will all share the same component ID. The master list of
component IDs is [here](O_component_ids.json). That file can be used by PEL
parsers to display a name for the component ID. The 'O' in the name is the
creator ID value for BMC created PELs.

## Message Registry Fields
The message registry schema is [here](schema/schema.json), and the message
registry itself is [here](message_registry.json). The registry will be
validated against the schema either during a bitbake build or during CI, or
eventually possibly both.

In the message registry, there are fields for specifying:

### Name
This is the key into the message registry, and is the Message property
of the OpenBMC event log that the PEL is being created from.

```
"Name": "xyz.openbmc_project.Power.Fault"
```

### Subsystem
This field is part of the PEL User Header section, and is used to specify
the subsystem pertaining to the error. It is an enumeration that maps to the
actual PEL value. If the subsystem isn't known ahead of time, it can be passed
in at the time of PEL creation using the 'PEL\_SUBSYSTEM' AdditionalData field.
In this case, 'Subsystem' isn't required, though 'PossibleSubsystems' is.

```
"Subsystem": "power_supply"
```

### PossibleSubsystems
This field is used by scripts that build documentation from the message
registry to know which subsystems are possible for an error when it can't be
hardcoded using the 'Subsystem' field. It is mutually exclusive with the
'Subsystem' field.

```
"PossibleSubsystems": ["memory", "processor"]
```

### Severity
This field is part of the PEL User Header section, and is used to specify
the PEL severity. It is an optional field. If it isn't specified, then the
severity of the OpenBMC event log will be converted into a PEL severity value.

It can either be the plain severity value, or an array of severity values that
are based on system type, where an entry without a system type will match
anything unless another entry has a matching system type.

```
"Severity": "unrecoverable"
```

```
"Severity":
[
    {
        "System": "system1",
        "SevValue": "recovered"
    },
    {
        "SevValue": "unrecoverable"
    }
]
```

The above example shows that on system 'system1' the severity will be
recovered, and on every other system it will be unrecoverable.

### Mfg Severity
This is an optional field and is used to override the Severity field when a
specific manufacturing isolation mode is enabled. It has the same format as
Severity.

```
"MfgSeverity": "unrecoverable"
```

### Event Scope
This field is part of the PEL User Header section, and is used to specify
the event scope, as defined by the PEL spec. It is optional and defaults to
"entire platform".

```
"EventScope": "entire_platform"
```

### Event Type
This field is part of the PEL User Header section, and is used to specify
the event type, as defined by the PEL spec. It is optional and defaults to
"not applicable" for non-informational logs, and "misc_information_only" for
informational ones.

```
"EventType": "na"
```

### Action Flags
This field is part of the PEL User Header section, and is used to specify the
PEL action flags, as defined by the PEL spec. It is an array of enumerations.

The action flags can usually be deduced from other PEL fields, such as the
severity or if there are any callouts. As such, this is an optional field and
if not supplied the code will fill them in based on those fields.

In fact, even if supplied here, the code may still modify them to ensure they
are correct. The rules used for this are
[here](../README.md#action-flags-and-event-type-rules).

```
"ActionFlags": ["service_action", "report", "call_home"]
```

### Mfg Action Flags
This is an optional field and is used to override the Action Flags field when a
specific manufacturing isolation mode is enabled.

```
"MfgActionFlags": ["service_action", "report", "call_home"]
```

### Component ID
This is the component ID of the PEL creator, in the form 0xYY00. For `BD`
SRCs, this is an optional field and if not present the value will be taken from
the upper byte of the reason code. If present for `BD` SRCs, then this byte
must match the upper byte of the reason code.

```
"ComponentID": "0x5500"
```

### SRC Type
This specifies the type of SRC to create.
The type is the first 2 characters
of the 8 character ASCII string field of the PEL. The allowed types are `BD`,
for the standard OpenBMC error, and `11`, for power related errors. It is
optional and if not specified will default to `BD`.

Note: The ASCII string for BD SRCs looks like: `BDBBCCCC`, where:
* BD = SRC type
* BB = PEL subsystem as mentioned above
* CCCC = SRC reason code

For `11` SRCs, it looks like: `1100RRRR`, where RRRR is the SRC reason code.

```
"Type": "11"
```

### SRC Reason Code
This is the 4 character value in the latter half of the SRC ASCII string. It
is treated as a 2 byte hex value, such as 0x5678. For `BD` SRCs, the first
byte is the same as the first byte of the component ID field in the Private
Header section that represents the creator's component ID.

```
"ReasonCode": "0x5544"
```

### SRC Symptom ID Fields
The symptom ID is in the Extended User Header section and is defined in the PEL
spec as the unique event signature string. It always starts with the ASCII
string. This field in the message registry allows one to choose which SRC words
to use in addition to the ASCII string field to form the symptom ID. All words
are separated by underscores. If not specified, the code will choose a default
format, which may depend on the SRC type.

For example: ["SRCWord3", "SRCWord9"] would be:
`<ASCII_STRING>_<SRCWord3>_<SRCWord9>`, which could look like:
`B181320_00000050_49000000`.

```
"SymptomIDFields": ["SRCWord3", "SRCWord9"]
```

### SRC words 6 to 9
In a PEL, these SRC words are free format and can be filled in by the user as
desired. On the BMC, the source of these words is the AdditionalData fields in
the event log.
The message registry provides a way for the log creator to
specify which AdditionalData property field to get the data from, and also to
define what the SRC word means for use by parsers. If not specified, these SRC
words will be set to zero in the PEL.

```
"Words6to9":
{
    "6":
    {
        "description": "Failing unit number",
        "AdditionalDataPropSource": "PS_NUM"
    }
}
```

### Documentation Fields
The documentation fields are used by PEL parsers to display a human readable
description of a PEL. They are also the source for the Redfish event log
messages.

#### Message
This field is used by the BMC's PEL parser as the description of the error log.
It will also be used in Redfish event logs. It supports argument substitution
using the %1, %2, etc. placeholders, allowing any of the SRC user data words 6 -
9 to be displayed as part of the message. If the placeholders are used, then
the `MessageArgSources` property must be present to say which SRC words to use
for each placeholder.

```
"Message": "Processor %1 had %2 errors"
```

#### MessageArgSources
This field is optional, but it is required when the Message field contains the
%X placeholder arguments. It is an array that says which SRC words to get the
placeholders from. In the example below, SRC word 6 would be used for %1, and
SRC word 7 for %2.

```
"MessageArgSources":
[
    "SRCWord6", "SRCWord7"
]
```

#### Description
A short description of the error. This is required by the Redfish schema to
generate a Redfish message entry, but is not used in Redfish or PEL output.

```
"Description": "A power fault"
```

#### Notes
This is an optional free format text field for keeping any notes for the
registry entry, as comments are not allowed in JSON. It is an array of strings
for easier readability of long fields.

```
"Notes": [
    "This entry is for every type of power fault.",
    "There is probably a hardware failure."
]
```

### Callout Fields
The callout fields allow one to specify the PEL callouts (either a hardware
FRU, a symbolic FRU, or a maintenance procedure) in the entry for a particular
error. These callouts can vary based on system type, as well as a user
specified AdditionalData property field. Callouts will be added to the PEL in
the order they are listed in the JSON. If a callout is passed into the error,
say with CALLOUT_INVENTORY_PATH, then that callout will be added to the PEL
before the callouts in the registry.

There is room for up to 10 callouts in a PEL.

#### Callouts example based on the system type

```
"Callouts":
[
    {
        "System": "system1",
        "CalloutList":
        [
            {
                "Priority": "high",
                "LocCode": "P1-C1"
            },
            {
                "Priority": "low",
                "LocCode": "P1"
            }
        ]
    },
    {
        "CalloutList":
        [
            {
                "Priority": "high",
                "Procedure": "SVCDOCS"
            }
        ]
    }
]
```

The above example shows that on system 'system1', the FRU at location P1-C1
will be called out with a priority of high, and the FRU at P1 with a priority
of low. On every other system, the maintenance procedure SVCDOCS is called
out.

#### Callouts example based on an AdditionalData field

```
"CalloutsUsingAD":
{
    "ADName": "PROC_NUM",
    "CalloutsWithTheirADValues":
    [
        {
            "ADValue": "0",
            "Callouts":
            [
                {
                    "CalloutList":
                    [
                        {
                            "Priority": "high",
                            "LocCode": "P1-C5"
                        }
                    ]
                }
            ]
        },
        {
            "ADValue": "1",
            "Callouts":
            [
                {
                    "CalloutList":
                    [
                        {
                            "Priority": "high",
                            "LocCode": "P1-C6"
                        }
                    ]
                }
            ]
        }
    ]
}
```

This example shows that the callouts were selected based on the 'PROC_NUM'
AdditionalData field.
When PROC_NUM was 0, the FRU at P1-C5 was called out.
When it was 1, P1-C6 was called out. Note that the same 'Callouts' array is
used as in the previous example, so these callouts can also depend on the
system type.

#### CalloutType
This field can be used to modify the failing component type field in the
callout when the default doesn't fit:

```
{
    "Priority": "high",
    "Procedure": "FIXIT22",
    "CalloutType": "config_procedure"
}
```

The defaults are:
- Normal hardware FRU: hardware_fru
- Symbolic FRU: symbolic_fru
- Procedure: maint_procedure

#### Symbolic FRU callouts with dynamic trusted location codes

A special case is when one wants to use a symbolic FRU callout with a trusted
location code, but the location code to use isn't known until runtime. This
means it can't be specified using the 'LocCode' key in the registry.

In this case, one should use the 'SymbolicFRUTrusted' key along with the
'UseInventoryLocCode' key, and then pass in the inventory item that has the
desired location code using the 'CALLOUT_INVENTORY_PATH' entry inside of the
AdditionalData property. The code will then look up the location code for that
passed in inventory FRU and place it in the symbolic FRU callout. The normal
FRU callout with that inventory item will not be created. The symbolic FRU
must be the first callout in the registry for this to work.

```
{
    "Priority": "high",
    "SymbolicFRUTrusted": "AIR_MOVR",
    "UseInventoryLocCode": true
}
```

## Modifying and Testing

The general process for adding new entries to the message registry is:

1. Update message_registry.json to add the new errors.
2. If a new component ID is used (usually the first byte of the SRC reason
   code), document it in O_component_ids.json.
3. Validate the file. It must be valid JSON and obey the schema.
   The `process_registry.py` script in
   `extensions/openpower-pels/registry/tools` will validate both, though it
   requires the python-jsonschema package to do the schema validation. This
   script is also run to validate the message registry as part of CI testing.

   ```
   ./tools/process_registry.py -v -s schema/schema.json -r message_registry.json
   ```

4. One can test what PELs are generated from these new entries without writing
   any code to create the corresponding event logs:
    1. Copy the modified message_registry.json into `/etc/phosphor-logging/` on
       the BMC. That directory may need to be created.
    2. Use busctl to call the Create method to create an event log
       corresponding to the message registry entry under test.

       ```
       busctl call xyz.openbmc_project.Logging /xyz/openbmc_project/logging \
       xyz.openbmc_project.Logging.Create Create ssa{ss} \
       xyz.openbmc_project.Common.Error.Timeout \
       xyz.openbmc_project.Logging.Entry.Level.Error 1 "TIMEOUT_IN_MSEC" "5"
       ```

    3. Check the PEL that was created using peltool.
    4. When finished, delete the file from `/etc/phosphor-logging/`.
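
Two of the field rules in this document can be spot-checked on a new entry before running the full schema validation: `Subsystem` and `PossibleSubsystems` are mutually exclusive (with one of them present), and for `BD` SRCs a `ComponentID` must match the upper byte of the `ReasonCode`. The following is only a minimal Python sketch of those checks and is not part of `process_registry.py`. The function name `check_entry` is hypothetical, and the sketch assumes the entry is a flat dictionary that uses the key names exactly as they appear in the examples in this document; the real registry file may nest these keys differently.

```python
def check_entry(entry):
    """Return a list of problems found in one registry entry.

    Assumption: `entry` is a flat dict using the key names shown in the
    examples in this document. The real file layout may nest them.
    """
    problems = []

    # 'Subsystem' and 'PossibleSubsystems' are mutually exclusive, and
    # 'PossibleSubsystems' is required when 'Subsystem' is absent.
    has_subsystem = "Subsystem" in entry
    has_possible = "PossibleSubsystems" in entry
    if has_subsystem and has_possible:
        problems.append("Subsystem and PossibleSubsystems are both present")
    elif not has_subsystem and not has_possible:
        problems.append("neither Subsystem nor PossibleSubsystems is present")

    # The SRC type defaults to BD when it is not specified.
    src_type = entry.get("Type", "BD")

    # Values such as "0x5544" are hex strings; int() accepts the 0x prefix
    # when the base is 16.
    reason_code = int(entry["ReasonCode"], 16)

    if src_type == "BD" and "ComponentID" in entry:
        component_id = int(entry["ComponentID"], 16)
        # Component IDs have the form 0xYY00, so the low byte must be zero.
        if component_id & 0x00FF:
            problems.append("ComponentID is not of the form 0xYY00")
        # For BD SRCs the YY byte must equal the upper byte of the reason code.
        if (component_id >> 8) != (reason_code >> 8):
            problems.append("ComponentID does not match the ReasonCode upper byte")

    return problems


# Example entry assembled from the field examples in this document.
example = {
    "Name": "xyz.openbmc_project.Power.Fault",
    "Subsystem": "power_supply",
    "ComponentID": "0x5500",
    "ReasonCode": "0x5544",
}
print(check_entry(example))  # prints []
```

An empty list means neither rule was violated. A check like this does not replace the schema validation in step 3; it only covers cross-field relationships described in the prose above.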