2edce4e2 | 17-Jan-2024 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Use raw procedure names in PEL msg reg
Start using the raw maintenance procedure names in the PEL message registry, such as BMC0001, instead of the enum values, such as 'bmc_firmware'. This way, the script that generates documentation can list the procedure in the section for that PEL. This was requested by the support team.
Note that there is still one user of the enums: the checkstop analysis code uses them when it creates callouts by passing them in via JSON user data files.
There are pointers added to the README and schema to find the available procedures and their descriptions.
Tested:
- The unit tests that verify the callout contents when PEL callouts are created via the message registry still pass.
- Also manually verified PEL procedure callouts look correct when created from the message registry.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I3e8416956120d4dae1ff40592fd0df4f0285d072
|
972dd4f6 | 11-Dec-2023 |
Lei YU <yulei.sh@bytedance.com> |
test: Set some cases non-parallel
Some UT cases read and write to `ERRLOG_PERSIST_PATH` dir so they will try to remove/create the dir and read/write files in the dir during the case execution. It causes races and the unit tests fail intermittently.
Fix the issue by setting the `is_parallel` to false for the specific cases that use the `ERRLOG_PERSIST_PATH` dir, so that these cases are run in sequence.
Tested: Verify the UT always passes for more than 300 loops.
Signed-off-by: Lei YU <yulei.sh@bytedance.com> Change-Id: I33f10ded3ca0dc11ae52f7f43849d9483ee9d202
|
5fb575ae | 20-Oct-2023 |
Patrick Williams <patrick@stwcx.xyz> |
clang-format: copy latest and re-format
clang-format-17 has some backwards incompatible changes that require additional settings for best compatibility and re-running the formatter. Copy the latest .clang-format from the docs repository and reformat the repository.
Change-Id: Ib459ac591ed3031de84d0239948d8daa583ef8a5 Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
|
22e8695f | 09-Jun-2023 |
Ivan Mikhaylov <fr0st61te@gmail.com> |
phosphor-logging: add rsyslog configuration for TCP, UDP
Add possibility to set TCP or UDP as transport protocol for rsyslog configuration.
Depends on change of dbus interfaces in: https://gerrit.openbmc.org/c/openbmc/phosphor-dbus-interfaces/+/64224
Tested: "@" set for UDP and "@@" for TCP in configuration file.
Change-Id: Ie849fb64b5e6cc8d87d7a984cd4d326e39aeb4ea Signed-off-by: Ivan Mikhaylov <fr0st61te@gmail.com>
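As a rough illustration of the "@" versus "@@" convention checked in the test note above, the sketch below builds an rsyslog forwarding line. The names `Transport` and `forwardingRule` and the `*.*` selector are assumptions made for this example; they are not taken from the commit.

```cpp
#include <cstdint>
#include <string>

// Hypothetical transport selector; not the D-Bus enum used by the project.
enum class Transport
{
    udp,
    tcp
};

// Build a legacy-syntax rsyslog forwarding rule.
// rsyslog uses a single "@" for UDP and a double "@@" for TCP.
inline std::string forwardingRule(const std::string& host, uint16_t port,
                                  Transport transport)
{
    const char* prefix = (transport == Transport::tcp) ? "@@" : "@";
    return std::string("*.* ") + prefix + host + ":" + std::to_string(port);
}
```

With these assumptions, `forwardingRule("192.0.2.1", 514, Transport::tcp)` yields `*.* @@192.0.2.1:514`.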
|
527ff346 | 29-Jun-2023 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Handle failing to start a PLDM cmd better
A recent PLDM bug caused the registerReceiveCallback() function, which is used to set up listening for the PLDM response from the host when telling it about a new PEL, to throw an exception.
When this happened, the code got stuck in the 'in progress' state, so it would never try again when the next PEL came in.
Fix that by having startCommand() throw an exception instead of calling the failure response function callback. With this change, the code will continue on to call the cleanupCmd() function so everything is ready when the next PEL comes in.
Tested: With the bad PLDM code, after the first PEL ran out of retry attempts, created another PEL and saw the code attempt again to call PLDM. Also, wrote a new unit test case for it.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I38034440435d6a86e8dd880eef09499f19dd6e9c
|
da5b76b2 | 01-Jun-2023 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Support for CheckstopFlag msg reg field
Similar to the DeconfigFlag field that was recently added, this one indicates the PEL is for a hardware checkstop and results in a bit in SRC hex word 5 being set.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: Ib05de7471ad3e32f48e7f20a5c611abc119fe82a
|
81bc5611 | 01-Jun-2023 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Fixes for gcc13
* Add the cstdint header file as now required to get the uint* types.
* Fix a move assignment test
* Refactor some nlohmann::json code to avoid:

```
/usr/include/c++/13/valarray:1201:1: note: template argument deduction/substitution failed:
../extensions/openpower-pels/registry.cpp:665:43: note: ‘const nlohmann::json_abi_v3_11_2::basic_json<>::value_type’ {aka ‘const nlohmann::json_abi_v3_11_2::basic_json<>’} is not derived from ‘const std::valarray<_Tp>’
  665 | (name == j["SRC"]["ReasonCode"] && type == LookupType::reasonCode));
```
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: Ia3e733602134a60008d0d47934f95a217d2a0eb1
|
0dd22c83 | 04-May-2023 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Clear deconfig flag after callout replaced
Making use of the previous commit's framework to call a function when a fan or power supply becomes present, add code to the Manager class to register a callback that will clear the deconfig flag for all PELs created with the power-thermal or fan component ID that have the location code of the replaced fan/PS as a callout.
This way, the degraded mode reporting code will no longer pick up those PELs in its report, since they are no longer relevant once the hardware has been replaced.
This is necessary only for fans or power supplies because they're the only N+1 hardware that can be hot plugged at runtime. And also because this is what the IBM service team wants.
Tested: Simulated missing hardware (changed present D-Bus property for fans, toggled PSU presence GPIO in the simulator for PSs). Saw errors get created for it, then simulated replacing it and saw those errors have their deconfig flag cleared, verifying before and after with peltool:
```
// Remove and replace fan
phosphor-fan-monitor: Fan /system/chassis/motherboard/fan0 presence state change to false
phosphor-log-manager: Created PEL 0x50000002 (BMC ID 2) with SRC 110076F1
phosphor-fan-monitor: Fan /system/chassis/motherboard/fan0 presence state change to true
phosphor-log-manager: Detected FRU /xyz/openbmc_project/inventory/system/chassis/motherboard/fan0 (U78DB.ND0.1234567-A0) present
phosphor-log-manager: Clearing deconfig flag in PEL 0x50000002 with SRC 110076F1 because U78DB.ND0.1234567-A0 was replaced

// Remove and replace PS
phosphor-log-manager: Created PEL 0x50000003 (BMC ID 3) with SRC 110015F6
...
phosphor-psu-monitor: Updating inventory present property. present:true invpath:/system/chassis/motherboard/powersupply0 name:powersupply0
phosphor-log-manager: Detected FRU /xyz/openbmc_project/inventory/system/chassis/motherboard/powersupply0 (U78DB.ND0.1234567-E0) present
phosphor-log-manager: Clearing deconfig flag in PEL 0x50000003 with SRC 110015F6 because U78DB.ND0.1234567-E0 was replaced
```
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: Iee05b4a612ca8f438f8c89f37b4e7b529a131a9f
|
784b02e7 | 25-Apr-2023 |
Matt Spinler <spinler@us.ibm.com> |
PEL:test: Refactor mocked checkDumpStatus usage
When creating an instance of the SRC class in a testcase, it needs a filled in mock of the checkDumpStatus() function. Instead of manually doing that everywhere a PEL or SRC class is created, just do it in the constructor of the mock DataInterface class.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I74790c67251465aae87d318ea37891d9eabab5e5
|
32e36b8c | 25-Apr-2023 |
Matt Spinler <spinler@us.ibm.com> |
PEL:pel_manager_test: Refactor temp dir cleanup
Several testcases were creating a temporary directory and then removing it at the end, except for one which missed the cleanup so the dir would stick around in /tmp after the run was done.
To fix that and to make sure it doesn't happen again, just add the creating and deleting of the temp directory to the test fixture class so the deleting happens automatically.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I2abb720b4c0aeb9dec6117d6399c3e4692709c68
|
0bf04b5d | 28-Apr-2023 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Convert a BCDTime to epoch milliseconds
Add a new function to convert a BCDTime value from a PEL to the number of milliseconds since the epoch.
This will be used to put the PEL creation timestamp on D-Bus.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: Ic470c9324e959b3e6a7eea29db4162a45dd54fc3
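For readers unfamiliar with BCD timestamps, the conversion described above could look something like the sketch below. The `BcdTime` struct, its field names, and the function names are assumptions for illustration and may not match the project's actual BCDTime type. The sketch also assumes the timestamp is in UTC and carries hundredths of a second as its finest unit.

```cpp
#include <cstdint>

// Hypothetical BCD timestamp layout; each byte holds two decimal digits.
struct BcdTime
{
    uint8_t yearMSB; // e.g. 0x20
    uint8_t yearLSB; // e.g. 0x23
    uint8_t month;
    uint8_t day;
    uint8_t hour;
    uint8_t minutes;
    uint8_t seconds;
    uint8_t hundredths;
};

// Convert one BCD byte to its integer value, e.g. 0x59 -> 59.
constexpr int fromBcd(uint8_t value)
{
    return (value >> 4) * 10 + (value & 0x0F);
}

// Days since 1970-01-01 for a Gregorian calendar date.
constexpr int64_t daysFromCivil(int64_t year, int month, int day)
{
    year -= (month <= 2) ? 1 : 0;
    const int64_t era = (year >= 0 ? year : year - 399) / 400;
    const int64_t yearOfEra = year - era * 400;
    const int64_t dayOfYear =
        (153 * (month + (month > 2 ? -3 : 9)) + 2) / 5 + day - 1;
    const int64_t dayOfEra =
        yearOfEra * 365 + yearOfEra / 4 - yearOfEra / 100 + dayOfYear;
    return era * 146097 + dayOfEra - 719468;
}

// Milliseconds since the epoch for a BCD timestamp (assumed to be UTC).
inline uint64_t bcdTimeToEpochMs(const BcdTime& t)
{
    const int year = fromBcd(t.yearMSB) * 100 + fromBcd(t.yearLSB);
    const int64_t days = daysFromCivil(year, fromBcd(t.month), fromBcd(t.day));
    const int64_t secs = days * 86400 + fromBcd(t.hour) * 3600 +
                         fromBcd(t.minutes) * 60 + fromBcd(t.seconds);
    return static_cast<uint64_t>(secs) * 1000 + fromBcd(t.hundredths) * 10;
}
```

The calendar arithmetic is done by hand here only to keep the sketch free of newer standard library features; a real implementation could use the platform's time utilities instead.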
|
7410260a | 25-Apr-2023 |
Matt Spinler <spinler@us.ibm.com> |
PEL:log_id_test: Remove file when testcase done
A testcase was missing a delete of a temporary file it created. Delete it so it isn't left around.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I4666241374ee7b6e260210539e7e29b34d018600
|
ac1ba3f2 | 10-May-2023 |
Patrick Williams <patrick@stwcx.xyz> |
clang-format: copy latest and re-format
clang-format-16 has some backwards incompatible changes that require additional settings for best compatibility and re-running the formatter. Copy the latest .clang-format from the docs repository and reformat the repository.
Change-Id: I077deb6e98025e4e8c6abd4d039f9af4db19342b Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
|
3fe93e96 | 14-Apr-2023 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Support new DeconfigFlag msg reg field
Add a new field to the PEL message registry to indicate that some hardware associated with a callout was deconfigured.
"DeconfigFlag": true
This results in bit 6 in the SRC's hex word 5 being set. It is looked at by the code that creates the periodic PELs when there is a system running with deconfigured hardware.
Note: For something to be considered deconfigured, it doesn't necessarily mean that firmware chose to not configure it. It could also mean that it is broken or missing, at least in the case of power supplies or fans.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I3e657cb52416fc50000e299bf2d583db7b67caf8
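To make "bit 6 in the SRC's hex word 5" concrete, here is a small sketch. It assumes bits are numbered from the most significant bit (bit 0 is the leftmost bit of the 32-bit word); the commit message does not state the numbering, so treat that as an assumption. The helper names are hypothetical.

```cpp
#include <cstdint>

// Mask for a bit of a 32-bit word, with bit 0 as the most significant bit.
// The MSB-first numbering is an assumption, not stated in the commit.
constexpr uint32_t msb0Mask(unsigned bit)
{
    return 0x80000000u >> bit;
}

// Return hex word 5 with the (assumed) deconfig bit set.
constexpr uint32_t setDeconfigFlag(uint32_t hexWord5)
{
    return hexWord5 | msb0Mask(6);
}
```

Under that numbering, bit 6 corresponds to the mask 0x02000000.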
|
b832aa5e | 21-Mar-2023 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Print component names in peltool
Every PEL section has a 2 byte component ID field in its header that peltool prints. Currently, it just prints the hex version, like "0x1000".
There are already JSON files in the flash that map component IDs to component names; this commit starts looking up the component names in those files and using them in the peltool output.
An example of a file is:
/usr/share/phosphor-logging/pels/O_component_ids.json:
{
    "1000": "bmc common function",
    "2000": "bmc error logging",
    ...
}
Where the 'O' in the filename is the creator ID field of the PEL. There is also a file for hostboot, which is B_component_ids.json.
Also, for PELs with a PHYP creator ID, just convert the ID to two characters, like 0x4552 -> "ER" as that is what they are.
peltool output examples:
"Created by": "bmc error logging",
"Created by": "hostboot: errl",
"Created by": "IO",
This matches what is already done by the python peltool.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: Id616739e1b7ca67c85dc7efa85dc34acf6aca9b5
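The two-character conversion mentioned for PHYP component IDs is just a matter of reading each byte of the 16-bit ID as an ASCII character. A minimal sketch, with a hypothetical function name:

```cpp
#include <cstdint>
#include <string>

// Interpret a 2-byte component ID as two ASCII characters,
// high byte first, e.g. 0x4552 -> "ER".
inline std::string componentIdToChars(uint16_t id)
{
    std::string name;
    name += static_cast<char>(id >> 8);
    name += static_cast<char>(id & 0xFF);
    return name;
}
```

The same helper maps 0x494F to "IO", which matches the last peltool output example above.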
|
bad056be | 25-Jan-2023 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Handle multiple inv paths per loc code
DataInterface::getInventoryFromLocCode() was only returning a single inventory path from GetFRUsByExpandedLocationCode() even though multiple paths may have been returned.
Mostly that was fine, except when a processor on a DCM was called out. That would lead to only one processor on the DCM being set to not functional by service_indicators.cpp, so on the web UI the actual CPU called out may not have been marked as unhealthy (health status critical in Redfish).
This commit changes getInventoryFromLocCode() to return all the paths that GetFRUsByExpandedLocationCode() returns, and then makes the corresponding changes in service_indicators.cpp to be able to handle multiple inventory paths per location code when setting them to not functional and creating a critical association.
The other code that was calling this function can just use the first path returned, since in those cases it's just needed to get the VPD information for the PEL, and all the paths would return the same info anyway since they had the same location code.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: Ia16f50881e4a4f84c171ae20b7a99eddcc98ad4f
|
9d921096 | 15-Dec-2022 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Capture the journal in UserData sections
If a PEL message registry entry has a 'JournalCapture' section, capture the listed portions of the journal in UserData sections for that error.
If the JSON looks like:
"JournalCapture": { "NumLines": 30 }
Then the code will capture the previous 30 lines from the journal into a single UserData section.
If the JSON looks like:
"JournalCapture": { "Sections": [ { "SyslogID": "phosphor-bmc-state-manager", "NumLines": 20 }, { "SyslogID": "phosphor-log-manager", "NumLines": 15 } ] }
Then the code will create two UserData sections, the first with the most recent 20 lines from phosphor-bmc-state-manager, and the second with 15 lines from phosphor-log-manager.
If a section would cause the PEL to exceed its maximum size of 16KB, it will be dropped. While the UserData class does have a shrink() method, it prunes data from the end, which would cause the most recent journal entries to be removed, which could be misleading.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I2ecbd8002b0e7087eb166a1219c6ab9da14a122a
|
711f1129 | 15-Dec-2022 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Read the journal info from the registry
Add support to the Registry class to read the JournalCapture information out of the registry entry for an error if it is there. This will then be used by the PEL class when creating a PEL.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I529e4fb352c6eb33ad73b2a04ac4467710865139
|
d96fa60d | 15-Dec-2022 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Create class to read from the journal
Create a Journal class that can extract messages out of the journal and return them as a vector of strings that look like:
"Dec 14 15:58:17 systemd[1]: systemd-tmpfiles-clean.service: Deactivated successfully."
It can either grab the previous N entries, or the previous N entries that match a specific SYSLOG_IDENTIFIER value.
The class follows the same strategy as the DataInterface class where a base class pointer is passed into the PEL Manager class so that during unit test it can be mocked.
Future commits will capture the journal into PEL UserData sections.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I9f4bb304c4b213165049fa00de2e62f962ae67f1
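The selection behavior described above (previous N entries, optionally filtered by SYSLOG_IDENTIFIER) can be modeled on an in-memory list as in the sketch below. It does not use the systemd journal API; `JournalEntry` and `lastEntries` are hypothetical names, and returning the result oldest-first is an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical in-memory journal record, oldest entries first in a vector.
struct JournalEntry
{
    std::string syslogId;
    std::string message;
};

// Return the messages of the most recent `count` entries, oldest first.
// If `syslogId` is non-empty, only entries with that identifier are counted.
inline std::vector<std::string>
    lastEntries(const std::vector<JournalEntry>& journal, std::size_t count,
                const std::string& syslogId = "")
{
    std::vector<std::string> messages;

    // Walk backwards from the newest entry until enough matches are found.
    for (auto it = journal.rbegin();
         it != journal.rend() && messages.size() < count; ++it)
    {
        if (syslogId.empty() || it->syslogId == syslogId)
        {
            messages.push_back(it->message);
        }
    }

    // Restore chronological order.
    std::reverse(messages.begin(), messages.end());
    return messages;
}
```

Keeping the selection logic behind a small interface like this is also what makes it easy to mock in unit tests, as the commit notes for the real class.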
|
875b6c7b | 20-Oct-2021 |
Vijay Lobo <vijaylobo@gmail.com> |
PEL: Add boot progress code to SRC hex data
Add the first 8 characters from the ASCII string field of the current progress SRC, taken from the xyz.openbmc_project.State.Boot.Raw D-Bus interface, to SRC hex word 4 when creating a PEL.
This is how the field is defined in the PEL spec, and is to help with debug so that one can know which part of the boot was occurring when the PEL was created. Note that at this point most progress codes are sent down from one of the host firmware subsystems and not created by the BMC.
The field is only inserted into the SRC if those characters are present and represent a valid 4 byte number, such as "C7004000". This is then represented as 0xC7004000 in the SRC word. Otherwise, the word is left at a value of zero.
For example:
...
"Valid Word Count": "0x09",
"Reference Code": "BD8D1001",
"Hex Word 2": "00080455",
"Hex Word 3": "2E2D0010",
"Hex Word 4": "C7004000", <---Progress code
"Hex Word 5": "00000000",
"Hex Word 6": "00000005",
"Hex Word 7": "00000000",
"Hex Word 8": "00000000",
"Hex Word 9": "00000000"
...
Signed-off-by: Vijay Lobo <vijaylobo@gmail.com> Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: Iba41e88626c0e081e5759b994e3630ef8b12daf4
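The validation rule described above (use the first 8 characters only if they form a valid 4-byte hex number, otherwise leave the word at zero) could be sketched as follows. The function name is hypothetical, and treating lowercase hex digits as valid is an assumption.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Convert the first 8 characters of a progress SRC ASCII string into a
// 32-bit word, e.g. "C7004000" -> 0xC7004000. Returns 0 if fewer than
// 8 characters are present or any of them is not a hex digit.
inline uint32_t progressCodeToWord(const std::string& ascii)
{
    if (ascii.size() < 8)
    {
        return 0;
    }

    uint32_t word = 0;
    for (std::size_t i = 0; i < 8; ++i)
    {
        const unsigned char c = static_cast<unsigned char>(ascii[i]);
        if (!std::isxdigit(c))
        {
            return 0;
        }
        const int digit =
            std::isdigit(c) ? c - '0' : std::toupper(c) - 'A' + 10;
        word = (word << 4) | static_cast<uint32_t>(digit);
    }
    return word;
}
```

Anything after the eighth character, such as padding, is ignored in this sketch.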
|
2544b419 | 04-Oct-2022 |
Patrick Williams <patrick@stwcx.xyz> |
clang-format: update with latest
Signed-off-by: Patrick Williams <patrick@stwcx.xyz> Change-Id: I329396457b83bb2eb8740629b4ac1fbe9106bced |
3d923311 | 01-Aug-2022 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Add CalloutsWhenNoADMatch msg reg support
There is a new use case where we need to do one callout in the message registry based on the value of an AdditionalData field value, and another callout in all other cases.
To support this, this commit adds a new 'CalloutsWhenNoADMatch' field to the PEL message registry that allows one to add callouts when there is no match on the 'ADValue' field. This behaves like an 'else' leg to the 'if ADValue == X' structure in the message registry.
Example:
{
    "ADName": "PROC_NUM",
    "CalloutsWithTheirADValues": [
        {
            "ADValue": "0",
            "Callouts": [
                // callouts when PROC_NUM == 0
            ]
        },
        {
            "ADValue": "1",
            "Callouts": [
                // callouts when PROC_NUM == 1
            ]
        }
    ],
    "CalloutsWhenNoADMatch": [
        {
            // callouts when PROC_NUM != 0 or 1
        }
    ]
}
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: Ib8e208ff950a643302e856c7dd2b7474fec61b26
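The if/else selection described in this commit can be modeled with a simple lookup, as in the sketch below. The `AdCalloutRule` struct, the `selectCallouts` function, and the use of plain strings for callouts are all hypothetical simplifications. Treating a missing AdditionalData field the same as a non-matching value is also an assumption.

```cpp
#include <map>
#include <string>
#include <vector>

// Callouts are reduced to plain strings for this illustration.
using Callouts = std::vector<std::string>;

// Hypothetical in-memory form of the registry structure shown above.
struct AdCalloutRule
{
    std::string adName;                              // "ADName"
    std::map<std::string, Callouts> calloutsByValue; // "CalloutsWithTheirADValues"
    Callouts calloutsWhenNoMatch;                    // "CalloutsWhenNoADMatch"
};

// Pick the callouts for the given AdditionalData, falling back to the
// no-match list when the value (or the field itself) is not found.
inline Callouts
    selectCallouts(const AdCalloutRule& rule,
                   const std::map<std::string, std::string>& additionalData)
{
    const auto field = additionalData.find(rule.adName);
    if (field != additionalData.end())
    {
        const auto match = rule.calloutsByValue.find(field->second);
        if (match != rule.calloutsByValue.end())
        {
            return match->second;
        }
    }
    return rule.calloutsWhenNoMatch;
}
```

With a rule keyed on PROC_NUM values "0" and "1", a PROC_NUM of "7" would return the no-match callouts in this model.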
|
4fd14219 | 20-Jul-2022 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Add missing gtest dependency
Fixes a compile failure when building the PEL tests in standalone mode (using subprojects).
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: Ida0abbd824b6573129a3b1262b7b490270ce7787
|
be952d2e | 01-Jul-2022 |
Matt Spinler <spinler@us.ibm.com> |
PEL: Fix more cppcheck warnings
This is the second of two commits to fix most of the cppcheck warnings in the PEL code. It doesn't fix all of them because some are false positives and some are just suggestions.
It's broken up into two commits to make them smaller.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: Id9f462386df85fd25d09529d6b410115ff4ccba8
|
4a375950 | 01-Jul-2022 |
Matt Spinler <spinler@us.ibm.com> |
Fix some cppcheck warnings
Fix some of the warnings in the base phosphor-logging code.
It doesn't fix all of them - some are false positives and some are just suggestions, but it does bring down the number a bit to make it easier to spot real issues.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I8a131cbc2941ccba79ef3775151007c770207bd5
|