15370292 | 14-May-2024 |
Arya K Padman <aryakpadman@gmail.com> |
PEL: Adding the support for Systems Key in message registry
The current implementation supports system-specific callouts through the 'System' key in message_registry.json.
This change adds a 'Systems' key whose value is an array of system name strings. The 'Systems' key can be used to define shared callouts for a group of systems.
A callout unique to a specific system can still be added with the existing 'System' key. If neither 'System' nor 'Systems' is present, or neither matches the system name, the default CalloutList is used, if one is configured.
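The selection rule described above can be sketched roughly as follows. This is an illustrative Python sketch, not the phosphor-logging C++ code: the function name `select_callouts` is hypothetical, and the sketch assumes that the callouts of every matching entry are combined, which the commit message does not state explicitly.

```python
def select_callouts(callouts, system_names):
    """Pick the callouts that apply to a system (hypothetical helper).

    callouts: the "Callouts" array of one registry entry.
    system_names: names reported by the Compatible interface.
    """
    names = set(system_names)
    matched = []
    default = []
    for entry in callouts:
        if "System" in entry:
            # Single-system entry: matches one exact name.
            if entry["System"] in names:
                matched.extend(entry["CalloutList"])
        elif "Systems" in entry:
            # Shared entry: matches if any listed name applies.
            if names.intersection(entry["Systems"]):
                matched.extend(entry["CalloutList"])
        else:
            # Entry with neither key is the default CalloutList.
            default = entry["CalloutList"]
    # Assumption: the default is used only when nothing matched.
    return matched if matched else default


registry_callouts = [
    {
        "Systems": [
            "com.ibm.Hardware.Chassis.Model.Rainier",
            "com.ibm.Hardware.Chassis.Model.Blue_Ridge",
        ],
        "CalloutList": [{"Priority": "medium", "SymbolicFRU": "service_docs"}],
    },
    {
        "System": "com.ibm.Hardware.Chassis.Model.Rainier",
        "CalloutList": [{"Priority": "high", "Procedure": "BMC0001"}],
    },
    {
        "CalloutList": [
            {"LocCode": "P0", "Priority": "high"},
            {"LocCode": "P0-C15", "Priority": "low"},
        ]
    },
]

rainier = select_callouts(
    registry_callouts,
    [
        "com.ibm.Hardware.Chassis.Model.Rainier2U",
        "com.ibm.Hardware.Chassis.Model.Rainier",
    ],
)
other = select_callouts(registry_callouts, ["com.example.Unknown"])
```

With the sample data, a Rainier system picks up both the shared and the system-specific callout, while an unrecognized system name falls back to the two default location-code callouts.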
Tested:
The test setup has the following names for the compatible interface.
```
busctl -j get-property xyz.openbmc_project.EntityManager /xyz/openbmc_project/inventory/system/chassis/Rainier_2U_Chassis xyz.openbmc_project.Inventory.Decorator.Compatible Names
{
    "type" : "as",
    "data" : [
        "com.ibm.Hardware.Chassis.Model.Rainier2U",
        "com.ibm.Hardware.Chassis.Model.Rainier"
    ]
}
```
The callout section in the message_registry.json for TestError1 is defined as below.
```
"Callouts": [
    {
        "Systems": ["com.ibm.Hardware.Chassis.Model.Rainier",
                    "com.ibm.Hardware.Chassis.Model.Blue_Ridge"],
        "CalloutList": [
            {"Priority": "medium", "SymbolicFRU": "service_docs"}
        ]
    },
    {
        "System": "com.ibm.Hardware.Chassis.Model.Rainier",
        "CalloutList": [
            {"Priority": "high", "Procedure": "BMC0001"}
        ]
    },
    {
        "CalloutList": [
            { "LocCode": "P0", "Priority": "high" },
            { "LocCode": "P0-C15","Priority": "low" }
        ]
    }
]
```
This leads to the PEL callout section below:
```
"Callout Section": {
    "Callout Count": "2",
    "Callouts": [{
        "FRU Type": "Maintenance Procedure Required",
        "Priority": "Mandatory, replace all with this type as a unit",
        "Procedure": "BMC0001"
    }, {
        "FRU Type": "Symbolic FRU",
        "Priority": "Medium Priority",
        "Part Number": "SVCDOCS"
    }]
}
```
Signed-off-by: Arya K Padman <aryakpadman@gmail.com>
Change-Id: Iea65816dcb822bb07043897488a6251929548dc7
|
041054a3 | 02-Apr-2024 |
devenrao <devenrao@in.ibm.com> |
PEL: Add a new error msg for FFDC collected after SBE chip-op success
The SBE enqueues FFDC (first failure data capture) for its internal operations as well as FFDC from internal hardware procedures.
After any SBE chip-op request from the BMC, if the chip-op succeeds, the BMC needs to check whether any FFDC is present and create a PEL based on the severity set in the FFDC packet.
Non-fatal errors can occur while executing asynchronous operations (e.g. auto-boot, MPIPL, DMT, or any periodic background operation in the SBE).
All accumulated non-fatal errors are reported back in every chip-op response.
'''
root@rain71bmc:/tmp# peltool -l
{
    "0x50006027": {
        "SRC": "BD204503",
        "Message": "SBE internal FFDC data after chipop request success",
        "PLID": "0x50006027",
        "CreatorID": "BMC",
        "Subsystem": "Memory",
        "Commit Time": "04/02/2024 12:42:14",
        "Sev": "Unrecoverable Error",
        "CompID": "bmc POZ PEL parser"
    },
    "0x50006028": {
        "SRC": "BD204503",
        "Message": "SBE internal FFDC data after chipop request success",
        "PLID": "0x50006028",
        "CreatorID": "BMC",
        "Subsystem": "Memory",
        "Commit Time": "04/02/2024 12:42:14",
        "Sev": "Predictive Error",
        "CompID": "bmc POZ PEL parser"
    }
}
'''
Signed-off-by: Marri Devender Rao <devenrao@in.ibm.com>
Change-Id: Ib488eccfca3df8c68035c78b49661749d7d9889c
|
ba1d481c | 29-Mar-2024 |
Arya K Padman <aryakpadman@gmail.com> |
PEL: Message_registry: System name as per new compatible interface
As part of the migration from the IBMCompatible interface to the new Compatible interface, the PEL code was changed to use the new system name format, 'com.ibm.Hardware.Chassis.Model.<system_name>'.
The message registry file still uses the old system name format, 'ibm,<system_name>', which results in a parsing error for the PEL message registry callout JSON.
Hence, modify the message registry file to use the new system name format.
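A check for leftover old-format names could look roughly like the sketch below. It is written in Python for illustration and is not part of phosphor-logging; the helper name `find_legacy_system_names` and the sample data (including the name `ibm,example-system`) are hypothetical. To avoid assuming a particular registry layout, the sketch walks any JSON value and inspects every string-valued "System" key.

```python
import json

# New-style prefix named in the commit message.
NEW_PREFIX = "com.ibm.Hardware.Chassis.Model."


def find_legacy_system_names(node):
    """Return every 'System' string value that lacks the new prefix."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "System" and isinstance(value, str):
                if not value.startswith(NEW_PREFIX):
                    found.append(value)
            else:
                # Recurse into nested objects and arrays.
                found.extend(find_legacy_system_names(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_legacy_system_names(item))
    return found


# Hypothetical fragment mixing one old-style and one new-style name.
sample = json.loads("""
{
    "Callouts": [
        {"System": "ibm,example-system", "CalloutList": []},
        {"System": "com.ibm.Hardware.Chassis.Model.Rainier2U", "CalloutList": []},
        {"CalloutList": []}
    ]
}
""")

legacy = find_legacy_system_names(sample)
```

For the sample fragment, only the old-style entry is reported; an empty result would mean every "System" value already uses the new format.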
Tested: The dev_callouts files in /usr/share/phosphor-logging/pels/ have the new system name format.
```
com.ibm.Hardware.Chassis.Model.Rainier1S4U_dev_callouts.json
com.ibm.Hardware.Chassis.Model.Rainier2U_dev_callouts.json
com.ibm.Hardware.Chassis.Model.Rainier4U_dev_callouts.json
```
Creating a PEL for SbeTimeout leads to the expected PEL callouts:
```
"Callout Section": {
    "Callout Count": "2",
    "Callouts": [{
        "FRU Type": "Normal Hardware FRU",
        "Priority": "Lowest priority replacement",
        "Location Code": "U78DA.ND0.WZS004A-P0",
        "Part Number": "02WG676",
        "CCIN": "2E2D",
        "Serial Number": "Y131UF07302J"
    }, {
        "FRU Type": "Normal Hardware FRU",
        "Priority": "Lowest priority replacement",
        "Location Code": "U78DA.ND0.WZS004A-P0-C22",
        "Part Number": "02WF429",
        "CCIN": "6B59",
        "Serial Number": "YL101314Y002"
    }]
}
```
Signed-off-by: Arya K Padman <aryakpadman@gmail.com>
Change-Id: I438bb3de5014edd6a510e5334c8b1981d5ac512c
|
61b13365 | 27-Mar-2024 |
Faisal Awada <faisal@us.ibm.com> |
PEL: Fix PEL to callout PGDPART symbolic FRU
Tested: Injected an error and verified the output:
busctl call xyz.openbmc_project.Logging /xyz/openbmc_project/logging \
    xyz.openbmc_project.Logging.Create Create ssa{ss} \
    xyz.openbmc_project.State.Shutdown.Power.Error.Regulator \
    xyz.openbmc_project.Logging.Entry.Level.Critical 0
'''
peltool -l
{
    "0x5000012C": {
        "SRC": "11002602",
        "Message": "A power off was issued because a regulator for standby power faulted",
        "PLID": "0x5000012C",
        "CreatorID": "BMC",
        "Subsystem": "Power Control Hardware",
        "Commit Time": "03/28/2024 02:49:10",
        "Sev": "Critical Error, System Termination",
        "CompID": "bmc power and thermal"
    }
}
'''
peltool -i 0x5000012C
{
"Private Header": {
    "Section Version": "1",
    "Sub-section type": "0",
    "Created by": "bmc power and thermal",
    "Created at": "03/28/2024 02:49:10",
    "Committed at": "03/28/2024 02:49:10",
    "Creator Subsystem": "BMC",
    "CSSVER": "",
    "Platform Log Id": "0x5000012C",
    "Entry Id": "0x5000012C",
    "BMC Event Log Id": "32"
},
"User Header": {
    "Section Version": "1",
    "Sub-section type": "0",
    "Log Committed by": "bmc error logging",
    "Subsystem": "Power Control Hardware",
    "Event Scope": "Entire Platform",
    "Event Severity": "Critical Error, System Termination",
    "Event Type": "Not Applicable",
    "Action Flags": [
        "Service Action Required",
        "Report Externally",
        "HMC Call Home"
    ],
    "Host Transmission": "Not Sent",
    "HMC Transmission": "Acked"
},
'''
"Primary SRC": {
    "Section Version": "1",
    "Sub-section type": "1",
    "Created by": "bmc power and thermal",
    "SRC Version": "0x02",
    "SRC Format": "0x55",
    "Virtual Progress SRC": "False",
    "I5/OS Service Event Bit": "False",
    "Hypervisor Dump Initiated":"False",
    "Backplane CCIN": "2E44",
    "Terminate FW Error": "True",
    "Deconfigured": "False",
    "Guarded": "False",
    "Error Details": {
        "Message": "A power off was issued because a regulator for standby power faulted"
    },
    "Valid Word Count": "0x09",
    "Reference Code": "11002602",
    "Hex Word 2": "00000055",
    "Hex Word 3": "2E440010",
    "Hex Word 4": "11002602",
    "Hex Word 5": "20000000",
    "Hex Word 6": "00000000",
    "Hex Word 7": "00000000",
    "Hex Word 8": "00000000",
    "Hex Word 9": "00000000",
    "Callout Section": {
        "Callout Count": "1",
        "Callouts": [{
            "FRU Type": "Symbolic FRU",
            "Priority": "Mandatory, replace all with this type as a unit",
            "Part Number": "PGDPART"
        }]
    }
},
'''
"Extended User Header": {
    "Section Version": "1",
    "Sub-section type": "0",
    "Created by": "bmc error logging",
    "Reporting Machine Type": "9028-21B",
    "Reporting Serial Number": "788C451",
    "FW Released Ver": "NL1060_034",
    "FW SubSys Version": "fw1060.00-4.36",
    "Common Ref Time": "00/00/0000 00:00:00",
    "Symptom Id Len": "20",
    "Symptom Id": "11002602_2E440010"
}
'''
Change-Id: I87418ffc51e0a2eb98e795c53c209965b81cfcd9
Signed-off-by: Faisal Awada <faisal@us.ibm.com>
|
4d7f943e | 07-Feb-2024 |
devenrao <devenrao@in.ibm.com> |
PEL: add new error messages for odyssey sbe related failures
There will be SBE instances for ocmb targets, so adding error messages for any SBE access failures.
Adding the new attribute "CHIP_TYPE", as SBE FFDC data now needs to be collected for both proc and ocmb targets.
Also added "chip type" to the libekb_get_sbe_ffdc method, which is called in the SBE FFDC handler, so that it can collect data for the specified chip type.
Tested:
'''
"Private Header": {
    "Created by": "0xF400",
},
"Primary SRC": {
    "Section Version": "1",
    "Sub-section type": "1",
    "Created by": "0xF400",
    "Error Details": {
        "Message": "chipop request failure reported by OCMB SBE",
        "SRC6": [
            "0x4AA01",
            "[0:15] chip position, [16:23] command class, [24:31] command type"
        ],
        "CHIP_TYPE": [
            "0x28",
            "Chip Type"
        ]
    },
    "Valid Word Count": "0x09",
    "Reference Code": "BD20F401",
},
},
'''
Signed-off-by: Marri Devender Rao <devenrao@in.ibm.com>
Change-Id: I622b670d23fc1f6b0d62ad65a41ddcf89c65d319
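The SRC6 field layout shown in the test output, "[0:15] chip position, [16:23] command class, [24:31] command type", can be unpacked as in the Python sketch below. The function name `decode_src6` is hypothetical, and the sketch assumes the bit ranges use IBM-style numbering, where bit 0 is the most significant bit of a 32-bit word; the commit message does not state the numbering convention.

```python
def decode_src6(word):
    """Split a 32-bit SRC6 word into its three fields.

    Assumption: bit 0 is the most significant bit, so bits [0:15]
    are the upper half-word and bits [24:31] are the lowest byte.
    """
    word &= 0xFFFFFFFF
    return {
        "chip_position": (word >> 16) & 0xFFFF,  # bits [0:15]
        "command_class": (word >> 8) & 0xFF,     # bits [16:23]
        "command_type": word & 0xFF,             # bits [24:31]
    }


fields = decode_src6(0x4AA01)
# fields == {"chip_position": 0x4, "command_class": 0xAA, "command_type": 0x01}
```

Under that assumption, the value 0x4AA01 from the test output corresponds to chip position 4, command class 0xAA, and command type 0x01.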
|