#
9972716f |
| 23-Sep-2024 |
harsh-agarwal1 <harsh.agarwal@ibm.com> |
PEL: Updated PEL spec error action flag bit#10
- This will help to indicate which SRCs need to be collected and regularly called home.
- Defined new entry in message registry schema action flags field
- Set this action flag in the registry for the daily clock PEL and the voltage regulator N+1 phase fault PEL
Tested: Sample output:
```bash
$ peltool -af
...
"Action Flags": [
    "Report Externally",
    "Heartbeat Call Home Event"
...
"Action Flags": [
    "Report Externally",
    "HMC Call Home",
    "Heartbeat Call Home Event"
...
$ peltool -afx
...
00000030 55 48 00 18 01 00 20 00 60 03 00 01 00 00 00 00 |
00000040 00 00 20 20 00 00 00 00 50 53 00 50 01 01 27 00 |
...
00000030 55 48 00 18 01 00 20 00 58 03 00 01 00 00 00 00 |
00000040 00 00 28 20 00 00 00 00 50 53 00 50 01 01 30 00 |
...
```
Change-Id: Ifa181374692013dfadc0dd744f6db61baad3d42e
Signed-off-by: Harsh Agarwal <Harsh.Agarwal@ibm.com>
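The sample output above can be read as a bit field. The following is a minimal sketch, not code from the repository. It assumes that the `20 20` and `28 20` byte pairs in the hex dump are the 16-bit action flags field, and that bits are numbered from the most significant end, so that bit #10 corresponds to mask 0x0020. The flag table and the helper names `bit_mask` and `decode_action_flags` are hypothetical and were inferred from the sample output only.

```python
# Flag names and bit positions are inferred from the sample peltool output;
# they are illustrative and not copied from the PEL specification.
ACTION_FLAG_BITS = {
    2: "Report Externally",
    4: "HMC Call Home",
    10: "Heartbeat Call Home Event",
}

FIELD_WIDTH = 16  # assumption: the action flags field is two bytes wide


def bit_mask(bit):
    """Return the mask for a bit numbered from the most significant end."""
    return 1 << (FIELD_WIDTH - 1 - bit)


def decode_action_flags(value):
    """Return the names of the known flags set in a 16-bit value."""
    return [
        name
        for bit, name in sorted(ACTION_FLAG_BITS.items())
        if value & bit_mask(bit)
    ]


if __name__ == "__main__":
    for raw in (0x2020, 0x2820):
        print(hex(raw), decode_action_flags(raw))
```

Under these assumptions, 0x2020 decodes to the first flag list in the sample and 0x2820 to the second.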
|
#
075c7923 |
| 16-Aug-2024 |
Patrick Williams <patrick@stwcx.xyz> |
clang-format: re-format for clang-18
clang-format-18 isn't compatible with the clang-format-17 output, so we need to reformat the code with the latest version. The way clang-18 handles lambda formatting also changed, so we have made changes to the organization default style format to better handle lambda formatting.
See I5e08687e696dd240402a2780158664b7113def0e for updated style. See Iea0776aaa7edd483fa395e23de25ebf5a6288f71 for clang-18 enablement.
Change-Id: I21d2ca8065f24fd73509229c517f5caf48934b60 Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
|
#
ac1ba3f2 |
| 10-May-2023 |
Patrick Williams <patrick@stwcx.xyz> |
clang-format: copy latest and re-format
clang-format-16 has some backwards incompatible changes that require additional settings for best compatibility and re-running the formatter. Copy the latest .clang-format from the docs repository and reformat the repository.
Change-Id: I077deb6e98025e4e8c6abd4d039f9af4db19342b Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
|
#
385446ba |
| 14-Apr-2023 |
Marri Devender Rao <devenrao@in.ibm.com> |
PEL: Add new error message for system health state
Add a new error message that is created when there are guard records, deconfigured records, or unresolved PELs with the deconfig bit set, in any combination.
The HMC subscribes to this error SRC. When notified, it requests a NAG dump, which captures the serviceable records in a JSON file and attaches it to the dump.
This error message acts as a reminder to service faulty hardware in the system that is still pending service.
Tested:
"0x500F05A1": {
    "SRC": "BD50F138",
    "PLID": "0x500F05A1",
    "CreatorID": "BMC",
    "Subsystem": "CEC Hardware",
    "Commit Time": "04/24/2023 07:47:45",
    "Sev": "Predictive Error",
    "CompID": "bmc faultlog"
}
"User Header": {
    "Section Version": "1",
    "Sub-section type": "0",
    "Log Committed by": "bmc error logging",
    "Subsystem": "CEC Hardware",
    "Event Scope": "Entire Platform",
    "Event Severity": "Predictive Error",
    "Event Type": "Not Applicable",
    "Action Flags": [
        "Service Action Required",
        "Report Externally",
        "HMC Call Home"
    ],
    "Host Transmission": "Not Sent",
    "HMC Transmission": "Not Sent"
},
"User Data 1": {
    "Section Version": "1",
    "Sub-section type": "1",
    "Created by": "bmc error logging",
    "DECONFIG_RECORD_COUNT": "0",
    "GUARD_WITH_ASSOC_ERROR_COUNT": "1",
    "GUARD_WITH_NO_ASSOC_ERROR_COUNT": "0",
    "UNRESOLVED_PEL_WITH_DECONFIG_BIT_COUNT": "0"
}
"Primary SRC": {
    "Section Version": "1",
    "Sub-section type": "1",
    "Created by": "bmc faultlog",
    "Callout Section": {
        "Callout Count": "1",
        "Callouts": [{
            "FRU Type": "Maintenance Procedure Required",
            "Priority": "Mandatory, replace all with this type as a unit",
            "Procedure": "BMC0008"
        }]
    }
}
Signed-off-by: Marri Devender Rao <devenrao@in.ibm.com>
Change-Id: I1a046d4416719e479378d7032f32663fc07fd03a
|
#
70235fc8 |
| 03-Jan-2023 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Fix trailing space in subsystem name
The 'Memory' PEL subsystem description had a trailing space. Remove it so peltool doesn't display it.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I0e38cac855a314205cc7b1bb64d3c8177ce82578
|
#
134e1e15 |
| 02-Dec-2022 |
Priyanga Ramasamy <priyanga24@in.ibm.com> |
PEL:Procedure to correct system VPD issues
This commit adds a new callout procedure "BMC0007" to correct:
* system backplane VPD mismatch between hardware value and backed up value.
* system backplane data being blank on both hardware and in backup.
Test: Tested that newly added callout procedure has been picked when there is a system VPD mismatch and when the data is blank.
peltool -i 0x500024AB
{
    "Private Header": {
        "Section Version": "1",
        "Sub-section type": "0",
        "Created by": "0x4000",
        "Created at": "12/05/2022 05:58:47",
        "Committed at": "12/05/2022 05:58:47",
        "Creator Subsystem": "BMC",
        "CSSVER": "",
        "Platform Log Id": "0x500024AB",
        "Entry Id": "0x500024AB",
        "BMC Event Log Id": "314"
    },
    "User Header": {
        "Section Version": "1",
        "Sub-section type": "0",
        "Log Committed by": "0x2000",
        "Subsystem": "CEC Hardware - VPD Interface",
        "Event Scope": "Entire Platform",
        "Event Severity": "Predictive Error",
        "Event Type": "Not Applicable",
        "Action Flags": [
            "Service Action Required",
            "Report Externally",
            "HMC Call Home"
        ],
        "Host Transmission": "Not Sent",
        "HMC Transmission": "Not Sent"
    },
    "Primary SRC": {
        "Section Version": "1",
        "Sub-section type": "1",
        "Created by": "0x4000",
        "SRC Version": "0x02",
        "SRC Format": "0x55",
        "Virtual Progress SRC": "False",
        "I5/OS Service Event Bit": "False",
        "Hypervisor Dump Initiated": "False",
        "Backplane CCIN": "2E2D",
        "Terminate FW Error": "False",
        "Deconfigured": "False",
        "Guarded": "False",
        "Error Details": {
            "Message": "A system VPD restoration error"
                       "occurred."
        },
        "Valid Word Count": "0x09",
        "Reference Code": "BD554008",
        "Hex Word 2": "00080055",
        "Hex Word 3": "2E2D0010",
        "Hex Word 4": "00000000",
        "Hex Word 5": "00000000",
        "Hex Word 6": "00000000",
        "Hex Word 7": "00000000",
        "Hex Word 8": "00000000",
        "Hex Word 9": "00000000",
        "Callout Section": {
            "Callout Count": "2",
            "Callouts": [{
                "FRU Type": "Normal Hardware FRU",
                "Priority": "Mandatory, replace all with this"
                            "type as a unit",
                "Location Code": "U78DA.ND0.WZS007H-P0",
                "Part Number": "02WG676",
                "CCIN": "2E2D",
                "Serial Number": "YF33UF19Y007"
            }, {
                "FRU Type": "Maintenance Procedure Required",
                "Priority": "Mandatory, replace all with this"
                            "type as a unit",
                "Procedure": "BMC0007"
            }]
        }
    },
    "Extended User Header": {
        "Section Version": "1",
        "Sub-section type": "0",
        "Created by": "0x2000",
        "Reporting Machine Type": "9105-22B",
        "Reporting Serial Number": "1392BE0",
        "FW Released Ver": "",
        "FW SubSys Version": "fw1040.00-1.2",
        "Common Ref Time": "00/00/0000 00:00:00",
        "Symptom Id Len": "20",
        "Symptom Id": "BD554008_2E2D0010"
    },
    "Failing MTMS": {
        "Section Version": "1",
        "Sub-section type": "0",
        "Created by": "0x2000",
        "Machine Type Model": "9105-22B",
        "Serial Number": "1392BE0"
    },
    "User Data 0": {
        "Section Version": "1",
        "Sub-section type": "1",
        "Created by": "0x2000",
        "BMCLoad": "0.22 0.41 0.33",
        "BMCState": "Ready",
        "BMCUptime": "0y 0d 0h 13m 6s",
        "BootState": "Unspecified",
        "ChassisState": "Off",
        "FW Version ID": "fw1040.00-1.2-2-g1859836242-dirty",
        "HostState": "Off",
        "System IM": "50001001"
    },
    "User Data 1": {
        "Section Version": "1",
        "Sub-section type": "1",
        "Created by": "0x2000",
        "CALLOUT_INVENTORY_PATH": "/xyz/openbmc_project/inventory/system"
                                  "/chassis/motherboard",
        "DESCRIPTION": "VPD data mismatch on cache and hardware for"
                       "record: LXR0 and keyword: LX",
        "Value on Cache: ": "0x31 0x0 0x4 0x1 0x0 0x30 0x0 0x71 ",
        "Value read from EEPROM: ": "0x64 0x65 0x4 0x1 0x0 0x30 0x0 0x71 "
    }
}
Signed-off-by: Priyanga Ramasamy <priyanga24@in.ibm.com> Change-Id: I738542d55781538285cb3747461929195cbd6d48
|
#
9b779e42 |
| 16-Dec-2022 |
Jim Wright <jlwright@us.ibm.com> |
PEL: Add additional power sequencer callouts
Facilitate future code development by adding additional power good and voltage fault callouts.
Signed-off-by: Jim Wright <jlwright@us.ibm.com> Change-Id: Ie56a1dc74bc6b52e44d719a00e59ffcd168ddc07
|
#
2544b419 |
| 04-Oct-2022 |
Patrick Williams <patrick@stwcx.xyz> |
clang-format: update with latest
Signed-off-by: Patrick Williams <patrick@stwcx.xyz> Change-Id: I329396457b83bb2eb8740629b4ac1fbe9106bced
|
#
c13bb1c5 |
| 01-Aug-2022 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Add OVERTMP symbolic FRU callout
This commit adds the OVERTMP symbolic FRU as a high callout to the perfloss, warning, and critical high temperature PELs. It uses the newly introduced CalloutsWhenNoADMatch message registry field to handle doing it for all sensors that aren't the ambient one.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I9775daefbf67063dc5bcde2a0ad763eef3e437a5
|
#
ace33dbe |
| 19-Apr-2022 |
Jim Wright <jlwright@us.ibm.com> |
PEL: Correct callouts for Everest pgood failures
Make necessary corrections to the callouts for pgood related failures on the Everest system.
Signed-off-by: Jim Wright <jlwright@us.ibm.com> Change-Id: Iea42d2bde52e61f9977d860b850917e1ebd08691
|
#
ef8e79c6 |
| 10-Mar-2022 |
Zane Shelley <zshelle@us.ibm.com> |
PEL: Add service procedure to find SUE root cause
An SUE indicates that a UE has occurred in another area of hardware. This procedure instructs service to look for a previous PEL containing the root cause of the SUE attention.
Signed-off-by: Zane Shelley <zshelle@us.ibm.com> Change-Id: Id9b7e6c65a06e52587ac40464456be42b9a557b1
|
#
1ba9ef15 |
| 15-Nov-2021 |
Brandon Wyman <bjwyman@gmail.com> |
PEL: Add isolation procedure for power OC
Add in an isolation procedure for an over-current power problem.
Change-Id: I5da1f3e3e9078f4f52fd78469f1fb8948f497e1b Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
|
#
16f5495b |
| 15-Nov-2021 |
Jayanth Othayoth <ojayanth@in.ibm.com> |
PEL: Add fsi path maintenance procedure
Maintenance procedures are part of the PEL callout section. This procedure is used to indicate the problem could be related to the fsi hardware path.
Signed-off-by: Jayanth Othayoth <ojayanth@in.ibm.com> Change-Id: I36ca9f0484e041adcaa542b9ea17b71ab545c6bd
|
#
52dbaba5 |
| 14-Nov-2021 |
Jayanth Othayoth <ojayanth@in.ibm.com> |
PEL: Add SBE code maintenance procedure
Maintenance procedures are part of the PEL callout section. This procedure is used to indicate that the problem is due to a code bug in the SBE image.
Signed-off-by: Jayanth Othayoth <ojayanth@in.ibm.com> Change-Id: I8e1a04b2c8aa83c178990331f181ed026ac62db9
|
#
f060bf0d |
| 14-Nov-2021 |
Jayanth Othayoth <ojayanth@in.ibm.com> |
PEL: Add next level of support maintenance procedure
Maintenance procedures are part of the PEL callout section. This procedure indicates that isolation is not possible and that the next level of support should be contacted.
Signed-off-by: Jayanth Othayoth <ojayanth@in.ibm.com> Change-Id: If1d481604e411947e47cb7c9c20d79e938c65db3
|
#
242be740 |
| 16-Sep-2021 |
Vijay Lobo <vijaylobo@gmail.com> |
PEL: Update pel_values to support bmcweb
Update pel_values to support display of the Message property in bmcweb. With the Redfish command, the Message property showed up as:
--old--
"Message": "xyz.openbmc_project.Power.PowerSupply.Error.InputFault"
--new--
"Message": "110015F0 event in subsystem: Power Supply",
Note that "Power Supply" is a $subsystem property, and $subsystem descriptions could contain ':'. To enhance readability, the $subsystem descriptions for CEC Hardware are updated to use '-' in place of ':'.
Change-Id: Ic6af0937f4e560753814462692b95185a6d84d07 Signed-off-by: Vijay Lobo <vijaylobo@gmail.com>
|
#
b63f75b7 |
| 18-Aug-2021 |
Rashmica Gupta <rashmica.g@gmail.com> |
PEL: Add altitude PEL registry entries
These will only be used on systems that use PELs and have the virtual altitude sensor. The errors come from the sensor-monitor app.
Signed-off-by: Rashmica Gupta <rashmica.g@gmail.com> Change-Id: I5fec3ec86a084f8c62f52194f2826c0565be32eb
|
#
ea2873dd |
| 18-Aug-2021 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Change BMCSP01 maint procedure to BMC0001
This name change allows 4 characters of uniqueness instead of just 2.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I12cb883619a3a3c2f843dd7ccc3db6524dea3459
|
#
479b6927 |
| 17-Aug-2021 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Remove 'no_vpd_for_fru' maintenance procedure
This procedure was being used in a callout when the location code could not be found on an inventory item. The code was changed to just not add a callout in this case.
This was done for the following reasons:
1) There's no expected reason that the inventory would be missing a location code for a valid inventory path.
2) There wasn't a way to give a hint to the end user about what the callout should be in that case. Usually maintenance procedures should have steps one can take to still do the appropriate replacement.
3) Because of 1), this case indicates a bad inventory path passed in by the caller. Theoretically there is a way to have a 'bmc_code' callout, but that wouldn't be appropriate to add in this log since the intent of this one is for another problem.
3a) There is no way to create a new error log from inside a PEL section, and I don't think this one case warrants coming up with a way to do so.
4) The inventory path in question is still being added into a UserData section so that development can debug it.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I0757080f04942807cc34029d8667387db6b959fe
|
#
593a4c66 |
| 16-Jun-2021 |
Vijay Lobo <vijaylobo@gmail.com> |
PEL: Support resolution property
Support resolution property to add callouts
Tested: Created new PEL using busctl and checked for the property to see if the value is updated. The error log daemon was restarted and the property is checked again to make sure the value is restored.
Verified serialization of the new error log by recreating it on an older code version, to make sure logs are created without the new property and the daemon does not crash.
Result:
root@rainier:~# busctl get-property xyz.openbmc_project.Logging \
/xyz/openbmc_project/logging/entry/1 xyz.openbmc_project.Logging.\
Entry Resolution
s "1. Priority: High, Procedure: BMCSP02\n2. Priority: Medium, PN: \
SVCDOCS\n"
Test with location Code:
root@p10bmc:~# busctl get-property xyz.openbmc_project.Logging \
/xyz/openbmc_project/logging/entry/3 xyz.openbmc_project.Logging.Entry \
Resolution
s "1. Location Code: U78DA.ND0.1234567-P0, Priority: Medium, PN: SVCDOCS\n2. \
Priority: Low, Procedure: BMCSP02\n"
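The layout of the Resolution string in the two results above can be sketched as follows. This is an illustration only, not the code from the commit: the `Callout` class and the `format_resolution` function are hypothetical names, and the field order (location code, priority, then procedure or part number) is inferred from the sample output.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Callout:
    """Hypothetical stand-in for one PEL callout entry."""
    priority: str
    procedure: Optional[str] = None
    part_number: Optional[str] = None
    location_code: Optional[str] = None


def format_resolution(callouts: List[Callout]) -> str:
    """Build a numbered, newline-terminated summary, one line per callout."""
    lines = []
    for index, callout in enumerate(callouts, start=1):
        fields = []
        if callout.location_code:
            fields.append(f"Location Code: {callout.location_code}")
        fields.append(f"Priority: {callout.priority}")
        if callout.procedure:
            fields.append(f"Procedure: {callout.procedure}")
        if callout.part_number:
            fields.append(f"PN: {callout.part_number}")
        lines.append(f"{index}. " + ", ".join(fields) + "\n")
    return "".join(lines)


if __name__ == "__main__":
    sample = [
        Callout(priority="High", procedure="BMCSP02"),
        Callout(priority="Medium", part_number="SVCDOCS"),
    ]
    print(repr(format_resolution(sample)))
```

With the two callouts from the first result as input, this sketch reproduces the string shown there once the shell line continuations are removed.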
Signed-off-by: Vijay Lobo <vijaylobo@gmail.com> Change-Id: I44eebbf794efeb8e752fff98de7c638c927982cd
|
#
dde99203 |
| 14-Jun-2021 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Add FanControllerOffline message reg entry
This error is posted when the fan monitor application cannot find any fan sensors on D-Bus, so it doesn't know the fan states when a power on is attempted. It will shut down the system.
The P0-T12 and P0-T17 callouts are connectors for the fan signal cable. These callouts follow the rules of using a symbolic FRU to identify the cable for the first endpoint and a 'CBLCONT' symbolic FRU for the other one.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I99eadacf7c9f3a5a1037c9ddbaeeb4f5e11c08e8
|
#
54ff54e9 |
| 14-Mar-2021 |
Jay Meyer <jaymeyer@us.ibm.com> |
PEL: Add message registry entry for power loss.
Add PEL message entry for power loss.
Tested: Power on machine. Build and install phosphor-log-manager and restart its service. Delete all existing PEL entries. Install patched message registry into /etc/phosphor-logging. Use the APC GUI for the ePDU to turn off power to all power supplies. Turn power back on, get the PEL entry which shows the blackout entry.
Signed-off-by: Jay Meyer <jaymeyer@us.ibm.com> Change-Id: Ic7b06c3db7d6356ac19dc46062ee8abc4f9c8723
|
#
6b3490ac |
| 20-Jan-2021 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Add ambient temp PEL registry entries
Add 3 new PEL message registry entries:
* temp crossed EPOW3 threshold
* temp crossed EPOW4 threshold
* temp under EPOW3 threshold
These error logs will come from the sensor-monitor application that monitors the HardShutdown (EPOW4) and SoftShutdown (EPOW3) D-Bus threshold interfaces. These interfaces will only ever be used on the ambient virtual sensor in the systems that use PELs.
There is no entry for going below the EPOW4 threshold because, while the sensor-monitor code would create an error for that, the interface will have the hysteresis set so that it will never happen.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: Ieed5364654a80697e189cf2e7614b0aead1fb218
|
#
7f678905 |
| 30-Nov-2020 |
Jim Wright <jlwright@us.ibm.com> |
PEL: Add power-sequencer registry entries
Make changes necessary to create PEL entries for errors issued by the power-sequencer application.
Signed-off-by: Jim Wright <jlwright@us.ibm.com> Change-Id: I36f9caaa44e3fb9e6116c59e267eadc406116d58
|
#
60d1c398 |
| 23-Sep-2020 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Add fan missing PEL registry entry
This error is created by phosphor-fan-presence-tach after a fan has been unplugged for a certain amount of time.
It calls out the backplane as low. In the future, this may be dependent on system/chassis type but that property is not available on D-bus yet.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I28738cb81d41f2f804f942d4a44f1aa125231b5e
|