History log of /openbmc/openpower-occ-control/occ_errors.cpp (Results 1 – 15 of 15)
Revision Date Author Comments
# 37abe9be 31-Oct-2024 Chris Cain <cjcain@us.ibm.com>

Update occ-control to use lg2 for all logging

Convert existing log<level>() trace statements to lg2::level()

Testing: Verified on Rainier - captured journal traces before and after
commit during boots, mode, pcap and ips changes.

Change-Id: I318fa7bf3902c641b0c28b09190db4b61d0a2fa9
Signed-off-by: Chris Cain <cjcain@us.ibm.com>


# f0295f52 12-Sep-2024 Chris Cain <cjcain@us.ibm.com>

Improve BMC error handling for OCC comm failures

- Delay starting OCC reset until all OCCs have been detected (or a
timeout occurs). This prevents multiple resets from being triggered and
helps detect when the reset has completed (the active sensor is set
after the reset is complete)
- Wait for PLDM response to OCC reset and HRESET requests and retry if
they fail
- If HRESET returns NOT_READY, collect SBE FFDC and try OCC reset. A
persistent failure will put the system in safe state.

- Prevent overwriting dvfs over-temp filename for p10 and beyond since
that old file is only present in old kernel
- Prevent assert when opening sysfs files. (added catch and then created
an OCC Comm failure PEL, which will force an OCC reset.)
- Check return code after reading sysfs files to confirm success. If
read fails, try reset to recover.

- Updated traces to include which processor/OCC encountered issues.
- Better recovery to close windows that were leaving system in partial
good state.

JIRA: PFES-66
Change-Id: I0b087d0e05bd8562682062e1c662f9e18164a720
Signed-off-by: Chris Cain <cjcain@us.ibm.com>


# 48002498 13-Feb-2024 Patrick Williams <patrick@stwcx.xyz>

prefer std::format over fmt

Switch to std::format to remove the dependency on fmt.

Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
Change-Id: Id3a1295ba8a90fb756cfc500892dcc5b3235e27b


# d8aab2a9 21-Apr-2023 Patrick Williams <patrick@stwcx.xyz>

elog-errors: remove generated file

Moved the YAML to phosphor-dbus-interfaces directly and removed
the generated files here.

Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
Change-Id: I76cf0fb8d72e91b52ba4b1647ea6b712509a9abd


# 9789e71f 25-May-2022 Eddie James <eajames@linux.ibm.com>

Log a PEL for communication, presence mismatch, and safe state errors

Add code to log a PEL in various error scenarios. Refactor some of the
error handling to get the return code out of the driver.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
Change-Id: Ifd91cfc063718e484ec8886df8357d115c6b41e3


# d0345ae0 10-Sep-2021 George Liu <liuxiwei@inspur.com>

Comment unused parameters

Signed-off-by: George Liu <liuxiwei@inspur.com>
Change-Id: Id5448711b5a6910514264ff9c008b7d32c26a8f8


# a8857c50 27-Jan-2021 Chris Cain <cjcain@us.ibm.com>

Trigger periodic OCC POLL commands when the OCCs are running

The OCC control app will periodically trigger kernel poll commands
to the OCC when the OCCs are active.
Code change also adds an interface to allow any OCC command to be
sent to an OCC. The pass-through interface was also updated to
use the new command object.

Tested: I did several IPLs on multiple Rainier systems to verify
the changes. I forced OCC resets to ensure polling stopped when
OCCs were disabled and restarted after it came out of reset.

Change-Id: I56970e781a988bb94f17ac38173ace8a68bb5fad
Signed-off-by: Chris Cain <cjcain@us.ibm.com>


# 774f9af9 19-Mar-2019 Eddie James <eajames@us.ibm.com>

Fix error attribute naming for Linux 5.0

There was a slight change to one of the error attributes as part of the
OCC driver upstreaming process. This commit also adds unit tests for the
error attributes. This required some refactoring to support the unit
tests.

Resolves openbmc/openbmc#3505

Signed-off-by: Eddie James <eajames@us.ibm.com>
Change-Id: I665b46e44b18befc8a728f7246bcda82f1f1a71c


# 94df8c90 14-Sep-2018 Gunnar Mills <gmills@us.ibm.com>

clang-format: Update to match docs repo

Update the .clang-format file and run clang-format-6.0.
This .clang-format matches the example one in
https://github.com/openbmc/docs/blob/master/cpp-style-and-conventions.md#clang-formatting

Change-Id: Id6760866dedbaeafd83ea8ef2e0303e30b8955aa
Signed-off-by: Gunnar Mills <gmills@us.ibm.com>
Signed-off-by: Brad Bishop <bradleyb@fuzziesquirrel.com>


# 1111087e 17-Jan-2018 Andrew Geissler <geissonator@yahoo.com>

Retry OCC read operations

OCC communication can get intermittent failures. Retry
all reads 3 times before declaring a failure and exiting
out of the openpower-occ-control application.

Resolves openbmc/openbmc#2805

Change-Id: I34dca5bc3c19a1f88975b427bdb6a683b41dbcb7
Signed-off-by: Andrew Geissler <geissonator@yahoo.com>
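
The retry behaviour described in this commit can be sketched roughly as follows. This is an illustration only, not the code from the commit: `readWithRetry`, `readOnce`, and `maxAttempts` are hypothetical names, and the real application exits after the final failure rather than returning an empty result.

```cpp
#include <functional>
#include <optional>
#include <string>

// Hypothetical helper: call readOnce up to maxAttempts times and return the
// first successful result, or std::nullopt if every attempt fails.
std::optional<std::string> readWithRetry(
    const std::function<std::optional<std::string>()>& readOnce,
    int maxAttempts = 3)
{
    for (int attempt = 0; attempt < maxAttempts; ++attempt)
    {
        if (auto data = readOnce())
        {
            return data; // success: stop retrying
        }
    }
    return std::nullopt; // all attempts failed; caller declares the failure
}
```

A caller would pass a callable that performs one read of the OCC sysfs file and treat an empty result as the point at which a failure is declared.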


# 482e31ff 14-Sep-2017 Eddie James <eajames@us.ibm.com>

Add watches for throttling reported by the OCC

Add Error objects to watch the sysfs entries provided by the OCC hwmon
driver that report various types of throttling. Also needed to add a
boolean input to the Error callback interface.

Resolves openbmc/openbmc#1821

Change-Id: I4425770a92ace0f73024b3dc4c577ce46957a62a
Signed-off-by: Eddie James <eajames@us.ibm.com>


# 636577f4 06-Oct-2017 Edward A. James <eajames@us.ibm.com>

Add OCC present count detection and watch

Add a Presence child class of Error to handle detecting the number of
OCCs available. Add an instance of this Presence class if the Device
detects that it is the master OCC, since the number of present OCCs is
only reported by the master OCC. When a change to the number of OCCs
reported is detected, compare with the number of OCCs determined to be
active by the Manager, and if there is a mismatch, follow the usual
error path (reset OCC, etc).

Partially resolves openbmc/openbmc#2285
See https://gerrit.openbmc-project.xyz/#/c/7843/

Change-Id: Idbaca52b307992d9b01fe15439ab746ef6d64397
Signed-off-by: Edward A. James <eajames@us.ibm.com>
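
The comparison described in this commit might look something like the sketch below. The function name `checkPresence` and its parameters are hypothetical, and the boolean passed to the callback is assumed to mean "error detected", in line with the boolean input to the Error callback mentioned elsewhere in this log.

```cpp
#include <functional>

// Hypothetical check: compare the OCC count reported by the master OCC with
// the number of OCCs the Manager considers active. On a mismatch, invoke the
// error callback with true so the usual error path (OCC reset, etc.) runs.
bool checkPresence(int reportedCount, int activeCount,
                   const std::function<void(bool)>& onError)
{
    const bool mismatch = (reportedCount != activeCount);
    if (mismatch && onError)
    {
        onError(true);
    }
    return mismatch;
}
```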


# 2c129132 29-Aug-2017 Vishwanatha Subbanna <vishwa@linux.vnet.ibm.com>

Seek to file offset 0 after reading error file

For an fd watching for data in a sysfs file, a poll would return
immediately after a read unless the file is closed and re-opened OR
the stream is rewound to offset 0. This commit chooses the latter.

From stackoverflow:

Once poll/select indicates that the value has changed, you need to close and
re-open the file, or seek to 0 and read again.

Also, use EPOLLPRI | EPOLLERR rather than EPOLLIN, as needed by sysfs_inotify

Change-Id: I243cdfd9a09c567eac5e52abd9980ebf90b94f89
Signed-off-by: Vishwanatha Subbanna <vishwa@linux.vnet.ibm.com>
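
A minimal sketch of the read-then-rewind pattern this commit describes. It uses plain `poll()` with `POLLPRI | POLLERR` instead of the epoll flags named in the commit, and the helper names `readAndRewind` and `waitForChange` are hypothetical, so treat it as an illustration under those assumptions rather than the actual change.

```cpp
#include <poll.h>
#include <unistd.h>

#include <string>

// Read the attribute's current contents, then seek back to offset 0 so the
// next poll() blocks until the kernel reports another change.
std::string readAndRewind(int fd)
{
    char buf[64] = {};
    std::string value;
    const ssize_t len = read(fd, buf, sizeof(buf));
    if (len > 0)
    {
        value.assign(buf, static_cast<size_t>(len));
    }
    lseek(fd, 0, SEEK_SET); // rewind instead of closing and re-opening
    return value;
}

// Wait until the sysfs attribute behind fd is reported as changed.
// sysfs signals a change as POLLPRI | POLLERR, not POLLIN.
bool waitForChange(int fd, int timeoutMs)
{
    pollfd pfd{};
    pfd.fd = fd;
    pfd.events = POLLPRI | POLLERR;
    return poll(&pfd, 1, timeoutMs) > 0;
}
```

In a watch loop, the caller would alternate `waitForChange` and `readAndRewind` on the same descriptor; without the rewind, the wait would keep returning immediately.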


# 2dc9b1a2 18-Aug-2017 Vishwanatha Subbanna <vishwa@linux.vnet.ibm.com>

Enable OCC error monitoring

Fixes openbmc/openbmc#2165

Change-Id: I93f317a32c910f279003fa0fce6ae2d597f90312
Signed-off-by: Vishwanatha Subbanna <vishwa@linux.vnet.ibm.com>


# ee4d83df 29-Jun-2017 Vishwanatha Subbanna <vishwa@linux.vnet.ibm.com>

Add support to watch for OCC errors

Change-Id: I98d95020a2d01e281e5c8efa825d6b4bd4c6c160
Signed-off-by: Vishwanatha Subbanna <vishwa@linux.vnet.ibm.com>