History log of /openbmc/openpower-occ-control/occ_errors.hpp (Results 1 – 15 of 15)
Revision Date Author Comments
# 3ece5b99 10-Jan-2025 Chris Cain <cjcain@us.ibm.com>

Add processor callout when BMC has no comm to an OCC

The non-informational PEL indicating that communication has been lost to
the OCC did not have any callouts.
This commit will add a callout for the processor that it was unable to
communicate with.

Example:
'''
"Error Details": {
    "Message": "BMC failed to communicate with the OCC"
},
"Valid Word Count": "0x09",
"Reference Code": "BD572684",
"Hex Word 2": "00000055",
"Hex Word 3": "2E2D0010",
"Hex Word 4": "00000000",
"Hex Word 5": "00000000",
"Hex Word 6": "00000000",
"Hex Word 7": "00000000",
"Hex Word 8": "00000000",
"Hex Word 9": "00000000",
"Callout Section": {
    "Callout Count": "1",
    "Callouts": [{
        "FRU Type": "Normal Hardware FRU",
        "Priority": "Mandatory, replace all with this type as a unit",
        "Location Code": "U78DA.ND0.WZS003T-P0-C15",
        "Part Number": "F201110",
        "CCIN": "5C67",
        "Serial Number": "YA39AAAA1828"
    }]
}
'''

Verified on Rainier.

Change-Id: Ia26668af568238d78263b81d16fd30a849f67134
Signed-off-by: Chris Cain <cjcain@us.ibm.com>


# f0295f52 12-Sep-2024 Chris Cain <cjcain@us.ibm.com>

Improve BMC error handling for OCC comm failures

- Delay starting OCC reset until all OCCs have been detected (or a
timeout occurs). This prevents multiple resets from being triggered and
helps detect when the reset has completed (the active sensor is set
after the reset completes).
- Wait for PLDM response to OCC reset and HRESET requests and retry if
they fail
- If HRESET returns NOT_READY, collect SBE FFDC and try OCC reset. A
persistent failure will put the system in safe state.

- Prevent overwriting the dvfs over-temp filename for p10 and beyond,
since the old file is only present in older kernels.
- Prevent assert when opening sysfs files. (added catch and then created
an OCC Comm failure PEL, which will force an OCC reset.)
- Check return code after reading sysfs files to confirm success. If
read fails, try reset to recover.

- Updated traces to include which processor/OCC encountered issues.
- Better recovery to close windows that were leaving system in partial
good state.

JIRA: PFES-66
Change-Id: I0b087d0e05bd8562682062e1c662f9e18164a720
Signed-off-by: Chris Cain <cjcain@us.ibm.com>
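
The "wait for the response and retry if it fails" item above can be pictured
with a small helper. This is only a sketch under assumptions: sendWithRetry
and its parameters are hypothetical names, the request is modelled as any
callable that returns true on success, and nothing here is taken from the
actual PLDM handling in occ-control.

```cpp
#include <functional>

// Hypothetical helper (not from the repository): call `request` until it
// reports success or the attempt budget is used up. Returns the number of
// attempts made on success, or 0 if every attempt failed.
unsigned sendWithRetry(const std::function<bool()>& request,
                       unsigned maxAttempts)
{
    for (unsigned attempt = 1; attempt <= maxAttempts; ++attempt)
    {
        if (request())
        {
            return attempt;
        }
    }
    return 0; // caller decides what a persistent failure means
}
```

A caller would treat a return value of 0 as the persistent-failure case and
escalate from there.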


# d7542c83 16-Aug-2024 Patrick Williams <patrick@stwcx.xyz>

clang-format: re-format for clang-18

clang-format-18 isn't compatible with the clang-format-17 output, so we
need to reformat the code with the latest version. The way clang-18
handles lambda formatting also changed, so we have made changes to the
organization default style format to better handle lambda formatting.

See I5e08687e696dd240402a2780158664b7113def0e for updated style.
See Iea0776aaa7edd483fa395e23de25ebf5a6288f71 for clang-18 enablement.

Change-Id: I94e2bfdc8fae9bc14e30c701a0e622709ee9b0fe
Signed-off-by: Patrick Williams <patrick@stwcx.xyz>


# 4b82f3e3 22-Apr-2024 Chris Cain <cjcain@us.ibm.com>

Create PEL if unable to get the OCC PDRs

The OCC PDRs are required so occ-control will know when the OCCs are
available and ready to communicate. If the PDRs never become available
there is currently no indication other than the OCC sensors not being
available.
This code change will create a PEL when the PDRs have not been found
after 5 minutes. The service will continue looking for the PDRs after
logging the PEL.

Change-Id: Ia2200273d04b91cba9cad3cf8465cc9cada73428
Signed-off-by: Chris Cain <cjcain@us.ibm.com>
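
The "log once after 5 minutes, then keep looking" behaviour described above
could be sketched roughly as follows. PdrWaitTracker and shouldLogPel are
hypothetical names used for illustration, and the clock is passed in
explicitly so the logic can be exercised without waiting; the real service
may be structured quite differently.

```cpp
#include <chrono>

// Hypothetical helper (not the actual occ-control code): tracks how long the
// OCC PDRs have been missing and reports exactly once when the limit passes.
class PdrWaitTracker
{
  public:
    using Clock = std::chrono::steady_clock;

    explicit PdrWaitTracker(
        Clock::time_point startTime,
        std::chrono::seconds waitLimit = std::chrono::minutes(5)) :
        start(startTime), limit(waitLimit)
    {}

    // Call on every poll while the PDRs are still missing. Returns true only
    // on the first poll at or after the limit, so the caller logs one PEL and
    // then simply continues polling.
    bool shouldLogPel(Clock::time_point now)
    {
        if (logged || (now - start) < limit)
        {
            return false;
        }
        logged = true;
        return true;
    }

  private:
    Clock::time_point start;
    std::chrono::seconds limit;
    bool logged = false;
};
```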


# fec4b0b1 04-Jan-2024 Matt Spinler <spinler@us.ibm.com>

Add the .Error. segment to error names

Add the '.Error.' segment in the hardcoded error name strings to match
the standard. This way the ReadFailure error string also matches the
error string generated by report<ReadFailure>().

Change-Id: Ieb64c9754a275ec52afee6346ab34e85c7675333
Signed-off-by: Matt Spinler <spinler@us.ibm.com>
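
To make the naming pattern concrete, the sketch below inserts an "Error"
segment before the last component of a dotted name. The commit itself edits
hardcoded strings; withErrorSegment is a hypothetical helper that only
illustrates the shape of the resulting name, and the name in the usage note
is an assumed example rather than a string quoted from the source.

```cpp
#include <string>

// Hypothetical helper for illustration only: inserts "Error" as the
// second-to-last segment of a dotted error name.
std::string withErrorSegment(const std::string& name)
{
    const auto lastDot = name.rfind('.');
    if (lastDot == std::string::npos)
    {
        return name; // no dotted segments; leave the name unchanged
    }
    return name.substr(0, lastDot) + ".Error" + name.substr(lastDot);
}
```

With an assumed input of "org.open_power.OCC.Device.ReadFailure", the helper
yields "org.open_power.OCC.Device.Error.ReadFailure".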


# a49c987e 10-May-2023 Patrick Williams <patrick@stwcx.xyz>

clang-format: copy latest and re-format

clang-format-16 has some backwards incompatible changes that require
additional settings for best compatibility and re-running the formatter.
Copy the latest .clang-format from the docs repository and reformat the
repository.

Change-Id: I39f8c77091744c8516e043054b4ed7207d85aa08
Signed-off-by: Patrick Williams <patrick@stwcx.xyz>


# 9789e71f 25-May-2022 Eddie James <eajames@linux.ibm.com>

Log a PEL for communication, presence mismatch, and safe state errors

Add code to log a PEL in various error scenarios. Refactor some of the
error handling to get the return code out of the driver.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
Change-Id: Ifd91cfc063718e484ec8886df8357d115c6b41e3


# bddcf853 07-Sep-2021 George Liu <liuxiwei@inspur.com>

Add meson build

This commit adds a meson build. Later, we will remove Autotools and
replace it with the meson build.

Tested: built openpower-occ-control successfully and Unit test passes.

Signed-off-by: George Liu <liuxiwei@inspur.com>
Change-Id: I7f5fbc7150194a78f9b36bb256613fc70b834130


# bcef3b48 09-Sep-2021 George Liu <liuxiwei@inspur.com>

Replace std::experimental::filesystem with std::filesystem

Signed-off-by: George Liu <liuxiwei@inspur.com>
Change-Id: I91d2f1b7a8858ba8c676b68693863bb35b56dffc
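
For readers unfamiliar with the migration, a change of this kind usually
comes down to switching the header and namespace once C++17 is available. The
sketch below is an assumption-laden illustration, not code from this
repository: readFirstLine is a hypothetical helper, and only standard
std::filesystem calls are used.

```cpp
#include <filesystem> // previously <experimental/filesystem>
#include <fstream>
#include <optional>
#include <string>
#include <system_error>

namespace fs = std::filesystem; // previously std::experimental::filesystem

// Hypothetical helper: return the first line of `file`, or std::nullopt if
// the path is not a regular file or the read fails.
std::optional<std::string> readFirstLine(const fs::path& file)
{
    std::error_code ec;
    if (!fs::is_regular_file(file, ec))
    {
        return std::nullopt;
    }
    std::ifstream in(file);
    std::string line;
    if (!std::getline(in, line))
    {
        return std::nullopt;
    }
    return line;
}
```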


# 774f9af9 19-Mar-2019 Eddie James <eajames@us.ibm.com>

Fix error attribute naming for Linux 5.0

There was a slight change to one of the error attributes as part of the
OCC driver upstreaming process. This commit also adds unit tests for the
error attributes. This required some refactoring to support the unit
tests.

Resolves openbmc/openbmc#3505

Signed-off-by: Eddie James <eajames@us.ibm.com>
Change-Id: I665b46e44b18befc8a728f7246bcda82f1f1a71c


# 94df8c90 14-Sep-2018 Gunnar Mills <gmills@us.ibm.com>

clang-format: Update to match docs repo

Update the .clang-format file and run clang-format-6.0.
This .clang-format matches the example one in
https://github.com/openbmc/docs/blob/master/cpp-style-and-conventions.md#clang-formatting

Change-Id: Id6760866dedbaeafd83ea8ef2e0303e30b8955aa
Signed-off-by: Gunnar Mills <gmills@us.ibm.com>
Signed-off-by: Brad Bishop <bradleyb@fuzziesquirrel.com>


# 482e31ff 14-Sep-2017 Eddie James <eajames@us.ibm.com>

Add watches for throttling reported by the OCC

Add Error objects to watch the sysfs entries provided by the OCC hwmon
driver that report various types of throttling. Also needed to add a
boolean input to the Error callback interface.

Resolves openbmc/openbmc#1821

Change-Id: I4425770a92ace0f73024b3dc4c577ce46957a62a
Signed-off-by: Eddie James <eajames@us.ibm.com>


# 636577f4 06-Oct-2017 Edward A. James <eajames@us.ibm.com>

Add OCC present count detection and watch

Add a Presence child class of Error to handle detecting the number of
OCCs available. Add an instance of this Presence class if the Device
detects that it is the master OCC, since the number of present OCCs is
only reported by the master OCC. When a change to the number of OCCs
reported is detected, compare with the number of OCCs determined to be
active by the Manager, and if there is a mismatch, follow the usual
error path (reset OCC, etc).

Partially resolves openbmc/openbmc#2285
See https://gerrit.openbmc-project.xyz/#/c/7843/

Change-Id: Idbaca52b307992d9b01fe15439ab746ef6d64397
Signed-off-by: Edward A. James <eajames@us.ibm.com>
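
The mismatch check described above can be sketched as a comparison that
triggers the error path. PresenceCheck, update and the callback are
hypothetical names, and the callback signature is an assumption; the actual
Presence class derives from Error and watches a sysfs entry, which this
sketch does not attempt to reproduce.

```cpp
#include <functional>
#include <utility>

// Hypothetical sketch (names not taken from the repository): compares the
// OCC count reported by the master OCC with the count the manager considers
// active, and invokes the error path on a mismatch.
class PresenceCheck
{
  public:
    explicit PresenceCheck(std::function<void()> callback) :
        onMismatch(std::move(callback))
    {}

    // Returns true if a mismatch was detected and the callback was invoked.
    bool update(int reportedCount, int activeCount)
    {
        if (reportedCount == activeCount)
        {
            return false;
        }
        onMismatch(); // e.g. the usual error handling such as an OCC reset
        return true;
    }

  private:
    std::function<void()> onMismatch;
};
```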


# 2dc9b1a2 18-Aug-2017 Vishwanatha Subbanna <vishwa@linux.vnet.ibm.com>

Enable OCC error monitoring

Fixes openbmc/openbmc#2165

Change-Id: I93f317a32c910f279003fa0fce6ae2d597f90312
Signed-off-by: Vishwanatha Subbanna <vishwa@linux.vnet.ibm.com>


# ee4d83df 29-Jun-2017 Vishwanatha Subbanna <vishwa@linux.vnet.ibm.com>

Add support to watch for OCC errors

Change-Id: I98d95020a2d01e281e5c8efa825d6b4bd4c6c160
Signed-off-by: Vishwanatha Subbanna <vishwa@linux.vnet.ibm.com>