#
37abe9be |
| 31-Oct-2024 |
Chris Cain <cjcain@us.ibm.com> |
Update occ-control to use lg2 for all logging
Convert existing log<level>() trace statements to lg2::level()
Testing: Verified on Rainier - captured journal traces before and after commit during boots, mode, pcap and ips changes.
Change-Id: I318fa7bf3902c641b0c28b09190db4b61d0a2fa9 Signed-off-by: Chris Cain <cjcain@us.ibm.com>
|
#
f0295f52 |
| 12-Sep-2024 |
Chris Cain <cjcain@us.ibm.com> |
Improve BMC error handling for OCC comm failures
- Delay starting OCC reset until all OCCs have been detected (or timeout). It will prevent multiple resets from being triggered and to help detecting when reset is completed (active sensor being set after reset is complete)
- Wait for PLDM response to OCC reset and HRESET requests and retry if they fail
- If HRESET returns NOT_READY, collect SBE FFDC and try OCC reset. A persistent failure will put the system in safe state.
- Prevent overwriting dvfs over-temp filename for p10 and beyond since that old file is only present in old kernel
- Prevent assert when opening sysfs files. (added catch and then created an OCC Comm failure PEL, which will force an OCC reset.)
- Check return code after reading sysfs files to confirm success. If read fails, try reset to recover.
- Updated traces to include which processor/OCC encountered issues.
- Better recovery to close windows that were leaving system in partial good state.
JIRA: PFES-66 Change-Id: I0b087d0e05bd8562682062e1c662f9e18164a720 Signed-off-by: Chris Cain <cjcain@us.ibm.com>
|
#
48002498 |
| 13-Feb-2024 |
Patrick Williams <patrick@stwcx.xyz> |
prefer std::format over fmt
Switch to std::format to remove the dependency on fmt.
Signed-off-by: Patrick Williams <patrick@stwcx.xyz> Change-Id: Id3a1295ba8a90fb756cfc500892dcc5b3235e27b
|
#
d8aab2a9 |
| 21-Apr-2023 |
Patrick Williams <patrick@stwcx.xyz> |
elog-errors: remove generated file
Moved the YAML to phosphor-dbus-interfaces directly and removed the generated files here.
Signed-off-by: Patrick Williams <patrick@stwcx.xyz> Change-Id: I76cf0fb8d72e91b52ba4b1647ea6b712509a9abd
|
#
9789e71f |
| 25-May-2022 |
Eddie James <eajames@linux.ibm.com> |
Log a PEL for communication, presence mismatch, and safe state errors
Add code to log a PEL in various error scenarios. Refactor some of the error handling to get the return code out of the driver.
Signed-off-by: Eddie James <eajames@linux.ibm.com> Change-Id: Ifd91cfc063718e484ec8886df8357d115c6b41e3
|
#
d0345ae0 |
| 10-Sep-2021 |
George Liu <liuxiwei@inspur.com> |
Comment unused parameters
Signed-off-by: George Liu <liuxiwei@inspur.com> Change-Id: Id5448711b5a6910514264ff9c008b7d32c26a8f8
|
#
a8857c50 |
| 27-Jan-2021 |
Chris Cain <cjcain@us.ibm.com> |
Trigger periodic OCC POLL commands when the OCCs are running
The OCC control app will periodically trigger kernel poll commands to the OCC when the OCCs are active. Code change also adds an interface to allow any OCC command to be sent to an OCC. The pass-through interface was also updated to use the new command object.
Tested: I did several IPLs on multiple Rainier systems to verify the changes. I forced OCC resets to ensure polling stopped when OCCs were disabled and restarted after it came out of reset.
Change-Id: I56970e781a988bb94f17ac38173ace8a68bb5fad Signed-off-by: Chris Cain <cjcain@us.ibm.com>
|
#
774f9af9 |
| 19-Mar-2019 |
Eddie James <eajames@us.ibm.com> |
Fix error attribute naming for Linux 5.0
There was a slight change to one of the error attributes as part of the OCC driver upstreaming process. This commit also adds unit tests for the error attributes. This required some refactoring to support the unit tests.
Resolves openbmc/openbmc#3505
Signed-off-by: Eddie James <eajames@us.ibm.com> Change-Id: I665b46e44b18befc8a728f7246bcda82f1f1a71c
|
#
94df8c90 |
| 14-Sep-2018 |
Gunnar Mills <gmills@us.ibm.com> |
clang-format: Update to match docs repo
Update the .clang-format file and run clang-format-6.0. This .clang-format matches the example one in https://github.com/openbmc/docs/blob/master/cpp-style-and-conventions.md#clang-formatting
Change-Id: Id6760866dedbaeafd83ea8ef2e0303e30b8955aa Signed-off-by: Gunnar Mills <gmills@us.ibm.com> Signed-off-by: Brad Bishop <bradleyb@fuzziesquirrel.com>
|
#
1111087e |
| 17-Jan-2018 |
Andrew Geissler <geissonator@yahoo.com> |
Retry OCC read operations
OCC communication can get intermittent failures. Retry all reads 3 times before declaring a failure and exiting out of the openpower-occ-control application.
Resolves openbmc/openbmc#2805
Change-Id: I34dca5bc3c19a1f88975b427bdb6a683b41dbcb7 Signed-off-by: Andrew Geissler <geissonator@yahoo.com>
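The retry behavior described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `readWithRetry` and its `readOnce` callback are hypothetical names, and it assumes "3 times" means three attempts in total.

```cpp
#include <functional>
#include <optional>
#include <string>

// Hypothetical helper (not from the repository): calls `readOnce` up to
// `maxAttempts` times and returns the first successful result. Returns
// std::nullopt if every attempt fails, so the caller can declare a failure.
std::optional<std::string>
    readWithRetry(const std::function<std::optional<std::string>()>& readOnce,
                  int maxAttempts = 3)
{
    for (int attempt = 0; attempt < maxAttempts; ++attempt)
    {
        if (auto data = readOnce())
        {
            return data; // success, stop retrying
        }
    }
    return std::nullopt; // all attempts failed
}
```

A caller would wrap the actual device read in `readOnce` and treat an empty result as the point at which the failure is reported.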
|
#
482e31ff |
| 14-Sep-2017 |
Eddie James <eajames@us.ibm.com> |
Add watches for throttling reported by the OCC
Add Error objects to watch the sysfs entries provided by the OCC hwmon driver that report various types of throttling. Also needed to add a boolean input to the Error callback interface.
Resolves openbmc/openbmc#1821
Change-Id: I4425770a92ace0f73024b3dc4c577ce46957a62a Signed-off-by: Eddie James <eajames@us.ibm.com>
|
#
636577f4 |
| 06-Oct-2017 |
Edward A. James <eajames@us.ibm.com> |
Add OCC present count detection and watch
Add a Presence child class of Error to handle detecting the number of OCCs available. Add an instance of this Presence class if the Device detects that it is the master OCC, since the number of present OCCs is only reported by the master OCC. When a change to the number of OCCs reported is detected, compare with the number of OCCs determined to be active by the Manager, and if there is a mismatch, follow the usual error path (reset OCC, etc).
Partially resolves openbmc/openbmc#2285 See https://gerrit.openbmc-project.xyz/#/c/7843/
Change-Id: Idbaca52b307992d9b01fe15439ab746ef6d64397 Signed-off-by: Edward A. James <eajames@us.ibm.com>
|
#
2c129132 |
| 29-Aug-2017 |
Vishwanatha Subbanna <vishwa@linux.vnet.ibm.com> |
Seek to file offset 0 after reading error file
For a fd watching for data in sysfs file, a poll would return immediately after read unless file is closed and re-opened OR stream is seeked to 0. This commit chooses latter.
From stackoverflow: Once poll/select indicates that the value has changed, you need to close and re-open the file, or seek to 0 and read again.
Also, use EPOLLPRI | EPOLLERR than EPOLLIN as needed by sysfs_inotify
Change-Id: I243cdfd9a09c567eac5e52abd9980ebf90b94f89 Signed-off-by: Vishwanatha Subbanna <vishwa@linux.vnet.ibm.com>
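The read-then-rewind pattern this commit describes might look like the sketch below. It assumes a Linux system and an already-open file descriptor. `readAndRewind` and `sysfsEvents` are hypothetical names chosen for illustration and are not taken from the repository.

```cpp
#include <cstdint>
#include <string>
#include <sys/epoll.h>
#include <unistd.h>

// Event mask named in the commit message: watch for EPOLLPRI | EPOLLERR
// on the sysfs attribute instead of EPOLLIN.
constexpr std::uint32_t sysfsEvents = EPOLLPRI | EPOLLERR;

// Hypothetical helper: reads everything currently available from `fd`,
// then seeks back to offset 0 so the next poll/read cycle starts from the
// beginning of the file instead of returning immediately.
std::string readAndRewind(int fd)
{
    std::string data;
    char buf[64];
    ssize_t n = 0;
    while ((n = read(fd, buf, sizeof(buf))) > 0)
    {
        data.append(buf, static_cast<std::size_t>(n));
    }
    // Rewind rather than closing and re-opening the file.
    lseek(fd, 0, SEEK_SET);
    return data;
}
```

Rewinding keeps the same descriptor registered with the event loop, which avoids re-adding a freshly opened descriptor after every read.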
|
#
2dc9b1a2 |
| 18-Aug-2017 |
Vishwanatha Subbanna <vishwa@linux.vnet.ibm.com> |
Enable OCC error monitoring
Fixes openbmc/openbmc#2165
Change-Id: I93f317a32c910f279003fa0fce6ae2d597f90312 Signed-off-by: Vishwanatha Subbanna <vishwa@linux.vnet.ibm.com>
|
#
ee4d83df |
| 29-Jun-2017 |
Vishwanatha Subbanna <vishwa@linux.vnet.ibm.com> |
Add support to watch for OCC errors
Change-Id: I98d95020a2d01e281e5c8efa825d6b4bd4c6c160 Signed-off-by: Vishwanatha Subbanna <vishwa@linux.vnet.ibm.com>
|