History log of /openbmc/openpower-occ-control/occ_status.cpp (Results 1 – 25 of 53)
Revision Date Author Comments
# 3ece5b99 10-Jan-2025 Chris Cain <cjcain@us.ibm.com>

Add processor callout when BMC has no comm to an OCC

The non-informational PEL indicating that communication has been lost to
the OCC did not have any callouts.
This commit will add a callout for the processor that it was unable to
communicate with.

Example:
'''
"Error Details": {
"Message": "BMC failed to communicate with the OCC"
},
"Valid Word Count": "0x09",
"Reference Code": "BD572684",
"Hex Word 2": "00000055",
"Hex Word 3": "2E2D0010",
"Hex Word 4": "00000000",
"Hex Word 5": "00000000",
"Hex Word 6": "00000000",
"Hex Word 7": "00000000",
"Hex Word 8": "00000000",
"Hex Word 9": "00000000",
"Callout Section": {
"Callout Count": "1",
"Callouts": [{
"FRU Type": "Normal Hardware FRU",
"Priority": "Mandatory, replace all with this type as a unit",
"Location Code": "U78DA.ND0.WZS003T-P0-C15",
"Part Number": "F201110",
"CCIN": "5C67",
"Serial Number": "YA39AAAA1828"
}]
}
'''

Verified on Rainier.

Change-Id: Ia26668af568238d78263b81d16fd30a849f67134
Signed-off-by: Chris Cain <cjcain@us.ibm.com>


# 37abe9be 31-Oct-2024 Chris Cain <cjcain@us.ibm.com>

Update occ-control to use lg2 for all logging

Convert existing log<level>() trace statements to lg2::level()

Testing: Verified on Rainier - captured journal traces before and after
commit during boots, mode, pcap and ips changes.

Change-Id: I318fa7bf3902c641b0c28b09190db4b61d0a2fa9
Signed-off-by: Chris Cain <cjcain@us.ibm.com>


# 1fe436dc 10-Oct-2024 Chris Cain <cjcain@us.ibm.com>

Disable Idle Power Saver support in efficiency modes

IPS will not be published on DBUS when the PowerMode is set to one of
the newer efficiency modes:
EfficiencyFavorPower
EfficiencyFavorPerformance (OEM only)

This prevents the Redfish interface from allowing GET/PATCH commands
and gives the GUI a way to know when to suppress displaying the
parameters on the customer GUI.

Testing:
Verified on Rainier.

When in a non-efficiency mode, the Redfish query returns IPS data:
(GET /redfish/v1/Systems/system)
'''
...
"Id": "system",
"IdlePowerSaver": {
"Enabled": true,
"EnterDwellTimeSeconds": 240,
"EnterUtilizationPercent": 8,
"ExitDwellTimeSeconds": 10,
"ExitUtilizationPercent": 12
},
"IndicatorLED": "Off",
...
'''

When in an efficiency mode, the Redfish query does not return IPS data.
'''
...
"Id": "system",
"IndicatorLED": "Off",
...
'''

Verified across re-ipls, OCC resets, app restarts.

Change-Id: I45bb0d8e97dab33a1a66c0d791f7bb4848bfce41
Signed-off-by: Chris Cain <cjcain@us.ibm.com>
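
The mode check described in this commit can be sketched as a small predicate. This is only an illustration, not the code from the commit: the `PowerMode` enum and the `ipsSupported` function are hypothetical names, and only the two efficiency mode names are taken from the commit message.

```cpp
// Hypothetical enum for illustration. The two Efficiency* names come from
// the commit message; the other entries stand in for non-efficiency modes.
enum class PowerMode
{
    Static,
    PowerSaving,
    MaximumPerformance,
    EfficiencyFavorPower,
    EfficiencyFavorPerformance,
};

// Returns true when the Idle Power Saver parameters should be published.
// Assumption: IPS is hidden only in the two efficiency modes.
constexpr bool ipsSupported(PowerMode mode)
{
    switch (mode)
    {
        case PowerMode::EfficiencyFavorPower:
        case PowerMode::EfficiencyFavorPerformance:
            return false;
        default:
            return true;
    }
}
```

A caller would consult such a predicate on every mode change and add or remove the IPS object accordingly, which would match the Redfish output shown in the test notes above.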


# f0295f52 12-Sep-2024 Chris Cain <cjcain@us.ibm.com>

Improve BMC error handling for OCC comm failures

- Delay starting OCC reset until all OCCs have been detected (or
timeout). This prevents multiple resets from being triggered and
helps detect when the reset has completed (the active sensor is set
after the reset is complete)
- Wait for PLDM response to OCC reset and HRESET requests and retry if
they fail
- If HRESET returns NOT_READY, collect SBE FFDC and try OCC reset. A
persistent failure will put the system in safe state.

- Prevent overwriting dvfs over-temp filename for p10 and beyond since
that old file is only present in old kernel
- Prevent assert when opening sysfs files. (added catch and then created
an OCC Comm failure PEL, which will force an OCC reset.)
- Check return code after reading sysfs files to confirm success. If
read fails, try reset to recover.

- Updated traces to include which processor/OCC encountered issues.
- Better recovery to close windows that were leaving system in partial
good state.

JIRA: PFES-66
Change-Id: I0b087d0e05bd8562682062e1c662f9e18164a720
Signed-off-by: Chris Cain <cjcain@us.ibm.com>


# d7542c83 16-Aug-2024 Patrick Williams <patrick@stwcx.xyz>

clang-format: re-format for clang-18

clang-format-18 isn't compatible with the clang-format-17 output, so we
need to reformat the code with the latest version. The way clang-18
handles lambda formatting also changed, so we have made changes to the
organization default style format to better handle lambda formatting.

See I5e08687e696dd240402a2780158664b7113def0e for updated style.
See Iea0776aaa7edd483fa395e23de25ebf5a6288f71 for clang-18 enablement.

Change-Id: I94e2bfdc8fae9bc14e30c701a0e622709ee9b0fe
Signed-off-by: Patrick Williams <patrick@stwcx.xyz>


# 48002498 13-Feb-2024 Patrick Williams <patrick@stwcx.xyz>

prefer std::format over fmt

Switch to std::format to remove the dependency on fmt.

Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
Change-Id: Id3a1295ba8a90fb756cfc500892dcc5b3235e27b


# c86d80fa 04-May-2023 Chris Cain <cjcain@us.ibm.com>

Implement processor throttle dbus properties

- create processor throttle dbus objects for each OCC (processor)
- update throttle properties based on OCC poll response data
or safe mode status.

Throttle data will be made available via Redfish

NAME                                       TYPE      SIGNATURE RESULT/VALUE FLAGS
org.freedesktop.DBus.Introspectable        interface -         -            -
.Introspect                                method    -         s            -
org.freedesktop.DBus.Peer                  interface -         -            -
.GetMachineId                              method    -         s            -
.Ping                                      method    -         -            -
org.freedesktop.DBus.Properties            interface -         -            -
.Get                                       method    ss        v            -
.GetAll                                    method    s         a{sv}        -
.Set                                       method    ssv       -            -
.PropertiesChanged                         signal    sa{sv}as  -            -
xyz.openbmc_project.Control.Power.Throttle interface -         -            -
.ThrottleCauses                            property  as        0            emits-change
.Throttled                                 property  b         false        emits-change

Example of throttled processor (due to a power limit):
as 1 "xyz.openbmc_project.Control.Power.Throttle.ThrottleReasons.PowerLimit"

Change-Id: I0af9d82fab9d694427d0adaa45f4a372d25fbc12
Signed-off-by: Chris Cain <cjcain@us.ibm.com>


# a49c987e 10-May-2023 Patrick Williams <patrick@stwcx.xyz>

clang-format: copy latest and re-format

clang-format-16 has some backwards incompatible changes that require
additional settings for best compatibility and re-running the formatter.
Copy the latest .clang-format from the docs repository and reformat the
repository.

Change-Id: I39f8c77091744c8516e043054b4ed7207d85aa08
Signed-off-by: Patrick Williams <patrick@stwcx.xyz>


# 52cce032 09-Feb-2023 Chris Cain <cjcain@us.ibm.com>

Remove Hypervisor check for PowerVM target

Remove the dbus query to determine if system is PowerVM.
It is assumed that all systems are PowerVM.

Change-Id: I50d3bb76b469383618aa280c591459674f337130
Signed-off-by: Chris Cain <cjcain@us.ibm.com>


# af40808f 22-Jul-2022 Patrick Williams <patrick@stwcx.xyz>

sdbusplus: use shorter type aliases

The sdbusplus headers provide shortened aliases for many types.
Switch to using them to provide better code clarity and shorter
lines. Possible replacements are for:
* bus_t
* exception_t
* manager_t
* match_t
* message_t
* object_t
* slot_t

Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
Change-Id: I9541d521bf67882215a4a66dce020e38ac2df065


# 9789e71f 25-May-2022 Eddie James <eajames@linux.ibm.com>

Log a PEL for communication, presence mismatch, and safe state errors

Add code to log a PEL in various error scenarios. Refactor some of the
error handling to get the return code out of the driver.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
Change-Id: Ifd91cfc063718e484ec8886df8357d115c6b41e3


# c8dd4599 12-May-2022 Sheldon Bailey <baileysh@us.ibm.com>

Fix Max fan speed with Deconfigured DIMM

Fan Speeds Remain at Maximum Speed With Deconfigured DIMMs
Tested: Injected Errors on DIMMs.
Signed-off-by: Sheldon Bailey <baileysh@us.ibm.com>
Change-Id: I797698f888cd9fb431d4f490c144eb5ad942bcd8


# 6d8f37a2 29-Apr-2022 Chris Cain <cjcain@us.ibm.com>

Ensure occActive called after device enabled

I noticed that the power cap bounds were not getting updated. This
happened because the device.master() call was made before setting the
occActive flag.

The device object now prevents the device from triggering sysfs reads
if the occActive flag is not set.
This change will set occActive right after enabling the device.

Also prevent extra calls to checkAllActiveSensors by only setting the
waitingForAll flag once.

Change-Id: If19771e53b64dd1d5e27e7ae153bd56e0e070aff
Signed-off-by: Chris Cain <cjcain@us.ibm.com>


# bd551de3 26-Apr-2022 Chris Cain <cjcain@us.ibm.com>

Queue active sensor updates before status objs created

If an OCC Active sensor update comes in before the Status objects have
been created, the value is lost. This commit will save that sensor
value until the status objects are created.
Also, prevent querying the master sysfs file if OCCs are not yet activated.

Change-Id: I8c0d2686024278ec7435e19cf99f59802e1ecceb
Signed-off-by: Chris Cain <cjcain@us.ibm.com>
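
The queuing idea in this commit can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: the `Status` and `Manager` types and all member names here are hypothetical stand-ins.

```cpp
#include <cstdint>
#include <map>
#include <optional>

// Hypothetical stand-in for the per-OCC Status object.
struct Status
{
    bool occActive = false;
};

class Manager
{
  public:
    // Called when an OCC Active sensor update arrives.
    void updateActive(std::uint8_t instance, bool active)
    {
        auto it = statusObjects.find(instance);
        if (it == statusObjects.end())
        {
            // Status object not created yet: keep the latest value
            // instead of dropping it.
            queuedActive[instance] = active;
            return;
        }
        it->second.occActive = active;
    }

    // Called once the Status object for an instance can be created.
    // Any queued sensor value is applied and removed from the queue.
    void createStatus(std::uint8_t instance)
    {
        Status& status = statusObjects[instance];
        if (auto node = queuedActive.extract(instance))
        {
            status.occActive = node.mapped();
        }
    }

    // Returns the active flag, or std::nullopt if no Status exists yet.
    std::optional<bool> isActive(std::uint8_t instance) const
    {
        auto it = statusObjects.find(instance);
        if (it == statusObjects.end())
        {
            return std::nullopt;
        }
        return it->second.occActive;
    }

  private:
    std::map<std::uint8_t, Status> statusObjects;
    std::map<std::uint8_t, bool> queuedActive;
};
```

Keeping only the latest queued value per instance is a design choice of this sketch; it assumes that the most recent sensor state is the only one that matters once the object exists.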


# aced3098 22-Apr-2022 Eddie James <eajames@linux.ibm.com>

Switch to setting Device active

With the latest Linux driver, the hwmon device is always bound, so
instead use the occ_active attribute to tell the driver when the
OCC has gone active.

Signed-off-by: Eddie James <eajames@linux.ibm.com>
Change-Id: Ia89b517d90a1947f87bb5937bf0d4c1b50191201


# bae4d07e 28-Feb-2022 Chris Cain <cjcain@us.ibm.com>

Wait for OCC Active PDR to become available

occ-control was attempting to communicate with OCCs before the OCC
Active sensor PDR was available. This commit will wait until all PDRs
are available before allowing communication with the OCCs.
It will also read the OCC Active sensor to determine the state before
allowing communication.

When Hostboot moves a system to safe mode it will set the PLDM state of
the OCC Active sensor to PLDM_STATE_SET_OPERATIONAL_RUNNING_STATUS_DORMANT
This commit will also handle processing of that DORMANT status.

Change-Id: I348cf14455d9373898deec2ac4a04b3752e81c4a
Signed-off-by: Chris Cain <cjcain@us.ibm.com>


# c567dc8d 01-Apr-2022 Chris Cain <cjcain@us.ibm.com>

Prevent termination on OCC open/read/write errors

Instead of creating an elog and terminating, the code will now retry on
an Open/Read/Write failure and, if it continues to fail, return a
CommFailure to the caller. The caller can then determine how to
handle it (reset the PM complex or ignore).
Normally all communication errors with the OCC should trigger an OCC
reset.

Change-Id: Ibf272270156edae565231ca429f15e8fca2f15bc
Signed-off-by: Chris Cain <cjcain@us.ibm.com>
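
The retry-then-report pattern described here can be sketched as below. The names `CmdStatus` and `withRetries` and the retry count parameter are assumptions made for illustration; the commit message does not say how many retries are made or how the status is represented.

```cpp
#include <functional>

// Hypothetical result type; "CommFailure" mirrors the term used in the
// commit message.
enum class CmdStatus
{
    Success,
    CommFailure,
};

// Runs `attempt` up to `maxTries` times. `attempt` returns true on
// success (for example, an open/read/write that succeeded).
CmdStatus withRetries(const std::function<bool()>& attempt, int maxTries)
{
    for (int i = 0; i < maxTries; ++i)
    {
        if (attempt())
        {
            return CmdStatus::Success;
        }
    }
    // Persistent failure: report it to the caller instead of terminating.
    return CmdStatus::CommFailure;
}
```

The caller then decides what a `CommFailure` means in its context, which per the commit message is normally an OCC reset.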


# 373af757 21-Feb-2022 Sheldon Bailey <baileysh@us.ibm.com>

openpower-occ-control:failure to read OCC state.

Failure to read OCC state set the OCCs sensors to Nan/Not Functional
Tested: cronus error inject on OCC with and without OCC resets.
Signed-off-by: Sheldon Bailey <baileysh@us.ibm.com>
Change-Id: I2a6bb6a431f09ea816979b3a482b54a28e21db53
Signed-off-by: Sheldon Bailey <baileysh@us.ibm.com>


# e2d0a43c 28-Mar-2022 Chris Cain <cjcain@us.ibm.com>

Handle file errors when device path gets unbound

occ-control was asserting when the hwmon path would get removed due to
device being unbound. This change will gracefully handle the path
getting removed / added back.

Tested on Rainier by unbind/bind of devices:
echo occ-hwmon.2 > /sys/bus/platform/drivers/occ-hwmon/unbind
echo occ-hwmon.2 > /sys/bus/platform/drivers/occ-hwmon/bind

Change-Id: I46fd2c2c54868ffb8183d3dc49cd0c2751165d3b
Signed-off-by: Chris Cain <cjcain@us.ibm.com>


# 40501a23 14-Mar-2022 Chris Cain <cjcain@us.ibm.com>

Monitor for power cap changes when occ-control started

The object that is monitoring for power cap changes needs to get created
when occ-control is started so that changes are not missed.
Added validation check to ensure that the value sent to the OCC matches
the user configured value on dbus.

Verified on Rainier

Change-Id: I0d77f4569e5459ff58d6fc72147153133d2104a3
Signed-off-by: Chris Cain <cjcain@us.ibm.com>


# 5d66a0aa 09-Feb-2022 Chris Cain <cjcain@us.ibm.com>

Expose power cap min/max on dbus

The min and max power cap values will be read from sysfs files and then
the data will be put on the dbus. This will allow users to know the
valid range of power cap available.

The PowerCap object was moved from Manager to Status.

Change-Id: I5196cc8645f84c31a5282cf844109bae47b09bf7
Signed-off-by: Chris Cain <cjcain@us.ibm.com>
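
Reading one bound from a sysfs-style file might look like the sketch below. The `readBound` helper is hypothetical; the commit message does not name the sysfs files or the units, so the path is left to the caller.

```cpp
#include <cstdint>
#include <fstream>
#include <optional>
#include <string>

// Reads a single unsigned integer from a sysfs-style text file.
// Returns std::nullopt if the file cannot be opened or does not start
// with a number, so the caller can skip the dbus update.
std::optional<std::uint32_t> readBound(const std::string& file)
{
    std::ifstream in(file);
    std::uint32_t value = 0;
    if (!(in >> value))
    {
        return std::nullopt;
    }
    return value;
}
```

Returning an optional rather than a default value keeps a failed read distinguishable from a legitimate bound of zero.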


# 1be4337b 09-Dec-2021 Chris Cain <cjcain@us.ibm.com>

Read default power mode & IPS parameters from DBus

If the power mode or idle power saver parameters have never been set,
they will be read from a set of default parameters that are on DBus.
This allows individual systems to have different default settings.
Mode/IPS parameters moved from Settings to OCC.Control

Code verified on Everest hardware with multiple scenarios including
host reboots, BMC reboots, OCC resets.

Change-Id: Id02613455db7f382791ff37c7dc420dbea1e0906
Signed-off-by: Chris Cain <cjcain@us.ibm.com>


# 6fa848a9 24-Jan-2022 Chris Cain <cjcain@us.ibm.com>

Fix occ-control terminations

1. Ensure PowerMode object created before attempting to use
2. Only call setChassisAssociation after getting successful temperature
reading.

Tested on Everest hw

Change-Id: I308c072cf5ab0235086c136ba7644125de0a8c6a
Signed-off-by: Chris Cain <cjcain@us.ibm.com>


# 36f9cded 22-Nov-2021 Chris Cain <cjcain@us.ibm.com>

Add support for OEM Power Modes

- Allow mode to be set via PassThrough interface
- Allow non-customer OEM power modes to be persisted
- Persist any OEM power mode settings
- moved mode related code from Status to PowerMode object
- merged PowerIPS into PowerMode object

Tested on Everest and Rainier.
Setting mode through PassThrough/ce-login:
busctl call org.open_power.OCC.Control /org/open_power/control/occ0 org.open_power.OCC.PassThrough SetMode yq 11 3600
Trace (via PassThrough interface)
openpower-occ-control[4440]: PassThrough::setMode() Setting Power Mode 11 (data: 3600)
openpower-occ-control[4440]: PowerMode::sendModeChange: SET_MODE(11,3600) command to OCC0 (9 bytes)
Trace (setting mode via GUI/Redfish):
openpower-occ-control[4440]: Power Mode Change Requested: xyz.openbmc_project.Control.Power.Mode.PowerMode.MaximumPerformance
openpower-occ-control[4440]: PowerMode::sendModeChange: SET_MODE(12,0) command to OCC0 (9 bytes)
Verified when system in any OEM mode that Redfish also reports OEM
Verified all modes are persisted across PM Complex resets and reboots

Change-Id: Idd0be05cb6fd74dbd0776145f212c49addd1c365
Signed-off-by: Chris Cain <cjcain@us.ibm.com>


# a7b74dc3 10-Nov-2021 Chris Cain <cjcain@us.ibm.com>

Clean up error cases on boot or when app restarted

- Detect when no or duplicate masters are found (force reset)
- Prevent sending commands to OCC when OCC is disabled
- Detect OCC state changes to SAFE (after 60 sec force reset)

Tested on Everest and Rainier hardware

Signed-off-by: Chris Cain <cjcain@us.ibm.com>
Change-Id: I490f182405e11da207b42a0607a532566479bfd9
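
The "no or duplicate masters" check from the first bullet can be sketched as a count over the detected OCCs. `MasterCheck`, `checkMasters` and `needsReset` are hypothetical names, and the input is assumed to be one flag per detected OCC indicating whether it reports itself as master.

```cpp
#include <algorithm>
#include <vector>

enum class MasterCheck
{
    Ok,
    NoMaster,
    DuplicateMaster,
};

// Classifies the set of detected OCCs by how many claim to be master.
MasterCheck checkMasters(const std::vector<bool>& isMaster)
{
    const auto count = std::count(isMaster.begin(), isMaster.end(), true);
    if (count == 0)
    {
        return MasterCheck::NoMaster;
    }
    if (count > 1)
    {
        return MasterCheck::DuplicateMaster;
    }
    return MasterCheck::Ok;
}

// Per the commit message, both error cases force a reset.
bool needsReset(MasterCheck result)
{
    return result != MasterCheck::Ok;
}
```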
