History log of /openbmc/phosphor-fan-presence/monitor/system.cpp (Results 1 – 25 of 36)
Revision Date Author Comments
# 5e15c3ba 20-Oct-2023 Patrick Williams <patrick@stwcx.xyz>

clang-format: copy latest and re-format

clang-format-17 has some backwards incompatible changes that require
additional settings for best compatibility and re-running the formatter.
Copy the latest .clang-format from the docs repository and reformat the
repository.

Change-Id: I3e9e6350864ac267819a4b8d670bef7d3746976e
Signed-off-by: Patrick Williams <patrick@stwcx.xyz>


# fbf4703f 17-Jul-2023 Patrick Williams <patrick@stwcx.xyz>

use std::format instead of fmt::format

The std::format is sufficient for the uses in this repository except
for in one file (override_fan_target.cpp, since P2286 isn't supported
by GCC yet). Switch to std::format whenever possible.

Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
Change-Id: Ib2576fb530a4d7ce238e1b0bd95b40b476ec2107


# 18fb12b8 09-May-2023 Matt Spinler <spinler@us.ibm.com>

monitor: Change Fan/Sensor def tuples to structs

The tuples were big and hard to read/use. Change them to structs so the
members are always named.

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: I79826563faf44636b251e614f45ff86f1e02c607
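
A minimal sketch of the readability difference this commit describes. The field names and types here (name, hasTarget, threshold) are hypothetical and are not taken from the repository; only the tuple-versus-struct contrast is the point.

```cpp
#include <cstddef>
#include <string>
#include <tuple>

// Before: positional access. The meaning of each index is not visible
// at the point of use.
using SensorTuple = std::tuple<std::string, bool, std::size_t>;

// After: a hypothetical struct with the same three fields, now named.
struct SensorDefinition
{
    std::string name;
    bool hasTarget;
    std::size_t threshold;
};

// Tuple version: the reader has to remember that 0 is the name and
// 1 is the has-target flag.
std::string describe(const SensorTuple& s)
{
    return std::get<0>(s) + (std::get<1>(s) ? " (has target)" : " (no target)");
}

// Struct version: the same logic with self-describing member names.
std::string describe(const SensorDefinition& s)
{
    return s.name + (s.hasTarget ? " (has target)" : " (no target)");
}
```

Both overloads produce the same string for the same data; the struct version simply documents itself.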


# 61b73296 10-May-2023 Patrick Williams <patrick@stwcx.xyz>

clang-format: copy latest and re-format

clang-format-16 has some backwards incompatible changes that require
additional settings for best compatibility and re-running the formatter.
Copy the latest .clang-format from the docs repository and reformat the
repository.

Change-Id: I152f141a5e8343b92b5ce81d3ca16eec77b5606b
Signed-off-by: Patrick Williams <patrick@stwcx.xyz>


# fce14908 13-Jan-2023 Chau Ly <chaul@amperecomputing.com>

monitor: Use host state to decide power state

The phosphor-fan-monitor service uses pgood to decide the power state.
When the power state is off, phosphor-fan-monitor should not check the
functionality of fans. However, with Ampere's Softoff (e.g. via power
cycle), pgood takes a long time to change state after the power cycle
command is accepted, so phosphor-fan-monitor fails to detect that the
power state is off and continues to check functionality. This results
in fans being marked nonfunctional while the host is off during a power
cycle. This patch adds a package configuration option to use
CurrentHostState instead of pgood to decide the power state. When
CurrentHostState is TransitioningToOff, which is set right after the
power cycle command, the power state is considered off.

Signed-off-by: Chau Ly <chaul@amperecomputing.com>
Change-Id: I6f459384b1d536f61c5df787d696412acc04ba02
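
The decision described above could be sketched as follows. This is not the repository's implementation: the helper name is hypothetical, and treating every state other than Off and TransitioningToOff as "on" is an assumption made for illustration.

```cpp
#include <string_view>

// Hypothetical helper: returns true when the host should be treated as
// powered on for fan-monitoring purposes.
bool hostStateMeansPowerOn(std::string_view currentHostState)
{
    // Compare only the last dot-separated component, so both a short
    // name ("Off") and a fully qualified enum string ending in ".Off"
    // are handled the same way.
    const auto dot = currentHostState.rfind('.');
    const auto name = (dot == std::string_view::npos)
                          ? currentHostState
                          : currentHostState.substr(dot + 1);

    // Assumption: TransitioningToOff counts as off, as the commit
    // message states; any other state is treated as on.
    return name != "Off" && name != "TransitioningToOff";
}
```

With this shape, monitoring would stop as soon as the power cycle command moves the host to TransitioningToOff, without waiting for pgood to drop.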


# 751c8beb 13-Jan-2023 Chau Ly <chaul@amperecomputing.com>

monitor: Add delay for host control

Some OpenBMC platforms use dbus-sensor and entity-manager to create
the fan sensors. On those systems, phosphor-fan-monitor starts before
the fan sensors are created during BMC boot. phosphor-fan-monitor
is designed to shut down the host when there are no fan tach sensors,
which is not desirable in this case. This patch adds a package
configuration option, delay-host-control, that adds a delay before
phosphor-fan-monitor turns off the host. The delay can be configured to
match each system's timing.

Signed-off-by: Chau Ly <chaul@amperecomputing.com>
Change-Id: I63cd85eb5e6cb04069ce7b4c21c2f4621d243502


# 4f472a86 26-Aug-2022 Matt Spinler <spinler@us.ibm.com>

monitor: Use USR1 signal to dump debug data

Similar to what fan control is already doing, this commit adds a handler
for the USR1 signal to write debug data to /tmp/fan_monitor_dump.json.
The data being written is the same data saved in an event log - the
current sensor status plus any of the Logger class's logs.

Example output, which shows fan0 recovering from previous faults:
{
    "logs": [
        ...
        [
            "Aug 26 17:04:47",
            "Setting tach sensor /xyz/openbmc_project/sensors/fan_tach/fan0_0 functional state to false. [target = 18000, input = 3446, allowed range = (10600 - NoMax) owned = true]"
        ],
        [
            "Aug 26 17:04:47",
            "Starting shutdown action 'EPOW Power Off: 60s/60s' due to cause '2 Nonfunctional Fan Rotors'"
        ],
        [
            "Aug 26 17:04:47",
            "Action EPOW Power Off: 60s/60s: Starting service mode timer"
        ],
        [
            "Aug 26 17:04:47",
            "Creating event log for faulted fan /xyz/openbmc_project/inventory/system/chassis/motherboard/fan0 sensor /xyz/openbmc_project/sensors/fan_tach/fan0_0"
        ]
    ],
    "sensors": {
        "sensors": {
            "/xyz/openbmc_project/sensors/fan_tach/fan0_0": {
                "functional": false,
                "in_range": true,
                "present": true,
                "prev_tachs": "[11829,11867,11829,11867,11829,11867,11718,11467]",
                "prev_targets": "[18000,9000,9040,10320,0,0,0,0]",
                "tach": 11829.0,
                "target": 18000,
                "ticks": 18
            },
            "/xyz/openbmc_project/sensors/fan_tach/fan0_1": {
                "functional": false,
                "in_range": true,
                "present": true,
                "prev_tachs": "[17857,17772,17857,17772,17201,17045,16741,16375]",
                "tach": 17857.0,
                "ticks": 20
            },
            "/xyz/openbmc_project/sensors/fan_tach/fan1_0": {
                "functional": true,
                "in_range": true,
                "present": true,
                "prev_tachs": "[11755,11792,11755,11792,11755,11792,11755,11792]",
                "prev_targets": "[18000,9000,9040,10320,0,0,0,0]",
                "tach": 11755.0,
                "target": 18000,
                "ticks": 0
            },
            ...
        }
    }
}

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: I84179f78ec83ca6bab788052d0bebe677c1fd29f
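
A rough sketch of the dump-on-signal pattern, using only the C++ standard library on a POSIX system. The function names are hypothetical and the daemon's actual event-loop integration is not shown here; the handler only sets a flag, and the file is written later from normal code, because very little is safe to do inside a signal handler.

```cpp
#include <csignal>
#include <fstream>
#include <string>

// Set by the handler, polled by the main loop.
volatile std::sig_atomic_t dumpRequested = 0;

// The handler does the minimum possible: record that a dump was asked for.
extern "C" void onUsr1(int)
{
    dumpRequested = 1;
}

// Hypothetical setup call, run once at startup.
void installDumpHandler()
{
    std::signal(SIGUSR1, onUsr1);
}

// Hypothetical main-loop hook: writes the already-serialized JSON text to
// `path` if a dump was requested since the last call. Returns true when a
// dump was written successfully.
bool writeDumpIfRequested(const std::string& path, const std::string& json)
{
    if (!dumpRequested)
    {
        return false;
    }
    dumpRequested = 0;

    std::ofstream file{path};
    file << json << '\n';
    return file.good();
}
```

On a running system the dump would then be requested by sending USR1 to the process, for example with `kill -USR1 <pid>`.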


# d16d464a 26-Aug-2022 Matt Spinler <spinler@us.ibm.com>

monitor: Capture 'in range' status in sensor FFDC

This field indicates whether the current sensor tach reading is
considered healthy, without having to manually do the math that fan
monitor does based on the current input and target values.

Example output:

"/xyz/openbmc_project/sensors/fan_tach/fan0_0": {
    "functional": true,
    "in_range": false,
    "present": true,
    "prev_tachs": "[3135,3132,3130,3127,3130,3125,3127,3125]",
    "prev_targets": "[9000,9040,10320,0,0,0,0,0]",
    "tach": 3135.0,
    "target": 9000,
    "ticks": 27
}

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: Ifbb6693f84fd20351bffd96c0a04e4e4872c4662
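
As an illustration of the kind of math the in_range field saves the reader from doing, here is a simplified check. It assumes a single percentage deviation below the target and no upper bound; the function name and the 15% figure used in the note below are hypothetical, and the real monitor may apply further adjustments not modeled here.

```cpp
#include <cstdint>

// Simplified, hypothetical range check: the reading is in range when it
// is at least (100 - deviationPercent)% of the target.
bool tachInRange(std::uint64_t target, double tach,
                 std::uint64_t deviationPercent)
{
    const double minimum = static_cast<double>(target) *
                           (100.0 - static_cast<double>(deviationPercent)) /
                           100.0;
    return tach >= minimum;
}
```

Under these assumptions, a target of 9000 with a 15% deviation gives a minimum of 7650, so the reading of 3135 in the example output would be out of range.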


# bf8e56f6 29-Jun-2022 Mike Capps <mikepcapps@gmail.com>

meson support: configuration option removal

D-Bus names and paths that were previously configurable at build-time
are now hard-coded and moved to dbus_paths.hpp to reduce the number of
configure-time options and simplify maintenance.

Signed-off-by: Mike Capps <mikepcapps@gmail.com>
Change-Id: I16d88daad90e747cc40d87c853874b1a5fedf5fa


# 87f9adc4 11-Aug-2022 Matt Spinler <spinler@us.ibm.com>

monitor: Add tick count to error log capture

If the 'count' method of looking for fan faults is configured, add the
tick counts to the fan sensor data capture when an error is created.
While the count will of course be at the max for the failed sensor, it
will show the counts for the other sensors which could help show if
other rotors are having issues but just haven't hit the thresholds yet.

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: I99a3e2480005244df0a0d2d86a36d6e762304bd7
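
One way to picture a tick-count fault detector is sketched below. The class and its behavior (increment on an out-of-range sample, decrement on an in-range sample, fault at the threshold) are assumptions for illustration, not a copy of the monitor's logic.

```cpp
#include <cstddef>

// Hypothetical per-sensor tick counter.
class TickCounter
{
  public:
    explicit TickCounter(std::size_t threshold) : _threshold(threshold) {}

    // Record one sampling interval. Returns true once the count has
    // reached the threshold, i.e. the sensor would be declared faulted.
    bool sample(bool inRange)
    {
        if (!inRange)
        {
            // Saturate at the threshold rather than counting forever.
            if (_ticks < _threshold)
            {
                ++_ticks;
            }
        }
        else if (_ticks > 0)
        {
            // Assumed recovery behavior: good samples count back down.
            --_ticks;
        }
        return _ticks >= _threshold;
    }

    std::size_t ticks() const
    {
        return _ticks;
    }

  private:
    std::size_t _threshold;
    std::size_t _ticks = 0;
};
```

Capturing ticks() for every sensor at error time is what lets a reader see rotors that are accumulating counts but have not yet crossed the threshold.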


# cb356d48 22-Jul-2022 Patrick Williams <patrick@stwcx.xyz>

sdbusplus: use shorter type aliases

The sdbusplus headers provide shortened aliases for many types.
Switch to using them to provide better code clarity and shorter
lines. Possible replacements are for:
* bus_t
* exception_t
* manager_t
* match_t
* message_t
* object_t
* slot_t

Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
Change-Id: I9029cc722e7712633c15436bd3868d8c3209f567


# 477b13bd 11-Jul-2022 Mike Capps <mikepcapps@gmail.com>

monitor,sensor-monitor: catch exceptions when creating BMC dumps

catch and log exceptions thrown when creating BMC dumps

Signed-off-by: Mike Capps <mikepcapps@gmail.com>
Change-Id: I986ca3e51302016886ca8ae571054a5b4260a093


# 752f24e4 06-Jul-2022 Matt Spinler <spinler@us.ibm.com>

monitor: Default tach sensors to true

Instead of reading the functional status of the tach sensors out of the
inventory on startup, just default them to true. Any issues with the
fans could then be rediscovered after the reboot.

This was the original behavior. It was probably changed with the intent
that the shutdown timers could immediately start back up again after a
reboot if things were nonfunctional before.

In practice, we've found that there can be a race between the shutdown
actions turning off the system (due to nonfunctional sensors) and the
sensor objects being marked functional again, even when the only reason
they were nonfunctional before the reboot was because the fan sensor
daemon was turned off before fan monitor on the way down.

For this to make a noticeable change, the shutdown actions/timers would
have to be in progress during the reboot anyway, which is pretty
unlikely.

Worst case, it would extend a shutdown by the time it takes an error to
be rediscovered, which is:

If the 'count' method is configured:
monitor_start_delay + (count_interval * threshold)

If the 'timebased' method is configured:
monitor_start_delay + nonfunc_rotor_error_delay

This has no effect on shutdowns caused by missing fans, as the code
still reads that out of the inventory on startup, plus it can be
instantaneously detected as opposed to being calculated over time.

In summary, extending the shutdown time in very uncommon cases seems
better than mistakenly shutting off a running system, which can be a
huge deal depending on the user.

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: I2840c5f2e79bd734626b4144713e4428af28551a
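
The two worst-case formulas above can be written out directly. The parameter names mirror the ones in the commit message; the function names are hypothetical, and the numbers in the note below are invented for illustration rather than taken from any real configuration.

```cpp
#include <chrono>
#include <cstddef>

using std::chrono::seconds;

// 'count' method: monitor_start_delay + (count_interval * threshold)
seconds worstCaseCount(seconds monitorStartDelay, seconds countInterval,
                       std::size_t threshold)
{
    return monitorStartDelay +
           countInterval * static_cast<seconds::rep>(threshold);
}

// 'timebased' method: monitor_start_delay + nonfunc_rotor_error_delay
seconds worstCaseTimeBased(seconds monitorStartDelay,
                           seconds nonfuncRotorErrorDelay)
{
    return monitorStartDelay + nonfuncRotorErrorDelay;
}
```

For instance, with a hypothetical 30 s start delay, a 1 s count interval, and a threshold of 30, the 'count' method would take at most 60 s to rediscover the fault.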


# 808d7fe8 13-Jun-2022 Mike Capps <mikepcapps@gmail.com>

meson support: remove code warnings 1

This commit contains code changes necessary to support the increased
warning level from Meson builds. Most changes are for unused variables.

To keep the review size manageable, this commit contains only monitor
and presence changes (and top-level json_config.hpp).

Signed-off-by: Mike Capps <mikepcapps@gmail.com>
Change-Id: I7280b512c54e8d5aeba3300764a239f3dcbab14d


# 7b34ee0f 04-May-2022 Mike Capps <mikepcapps@gmail.com>

monitor: include previous targets and tachs in PEL

To discover the source of certain fan ramp-up failures, this change
outputs the previous 8 targets and tach readings. The strategy is to see
if hardware limitations prevent attaining the targets quickly enough.

Signed-off-by: Mike Capps <mikepcapps@gmail.com>
Change-Id: Ia38867986b8a8a651de5d01766393c07d413273c


# 683a96c6 27-Apr-2022 Mike Capps <mikepcapps@gmail.com>

monitor: Capture BMC dumps on fan/ambient shutdowns

When fan-monitor or sensor-monitor generates an EPOW, this change
creates a BMC dump after the system is powered off and all error logs
are created.

Change-Id: Iacdd2d2b388e79988e2536d52497f0e697e1d444
Signed-off-by: Mike Capps <mikepcapps@gmail.com>


# b4379a1e 11-Oct-2021 Mike Capps <mikepcapps@gmail.com>

Monitor : handle inventory service offline

Using nameHasOwner and nameOwnerChanged D-Bus signals, a callback is
activated when inventory is started.

There are two primary modes of operation. With Compatible Interfaces, the
inventory-detection callback will fail; however, start() will be called a
second time after EntityManager starts, which forces a reload of the
proper config for the machine type. Separately, if no EntityManager
exists, the inventory-detection callback will succeed and use the
default configuration file.

To test: stop the fan monitor and inventory services. Start the monitor,
wait 10s, then start Inventory; after about 15s you should see the online
detection.

Signed-off-by: Mike Capps <mikepcapps@gmail.com>
Change-Id: I289493a0aabb849abee8ce8de047513e94ee2219


# ddb773b2 06-Oct-2021 Patrick Williams <patrick@stwcx.xyz>

catch exceptions as const

Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
Change-Id: Id1b5054d3147c39d98309bc11ed7016d6909e2a6
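
For readers unfamiliar with the idiom named in the subject line, a minimal standalone example follows. The function name and message are hypothetical; the point is only the `const` reference in the handler, which signals that the handler does not modify the exception object.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical example: throw and catch an exception, returning its text.
std::string describeFailure()
{
    try
    {
        throw std::runtime_error("sensor unavailable");
    }
    catch (const std::exception& e) // const reference: read-only access
    {
        return e.what();
    }
}
```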


# 25f0327e 13-Sep-2021 Mike Capps <mikepcapps@gmail.com>

Monitor: Support hwmon service offline during startup

It is possible for fan-monitor to start up before the Hwmonitor service,
causing unhandled exceptions that block system initialization. This fix
catches the exception until a proper hwmon presence detector is
deployed.

If the exception is caught, this code change forces a re-subscription
during the poweron event to ensure tach sensors will receive published
updates upon resumption of the hwmon service.

Signed-off-by: Mike Capps <mikepcapps@gmail.com>
Change-Id: I8e696e747c432d7a6f696c5ccd9dab73abf7708f


# fdcd5db3 20-May-2021 Mike Capps <mikepcapps@gmail.com>

monitor: Subscribe to tach target and feedback services

Subscribes to nameOwnerChanged signals for the services of the sensor
and target interfaces for each configured fan. If those services go
offline, the fan tach sensors should get marked nonfunctional due to no
longer receiving updated target or feedback values. In this design, we
use the existing method of determining when a fan tach sensor should be
marked nonfunctional to allow a recovery window, wherein a brief
offline/online transition (such as during a restart) will not trigger a
nonfunctional state change.

Change-Id: I0a935ccad5a864dc952d023185356a1ef1226830
Signed-off-by: Mike Capps <mikepcapps@gmail.com>


# bb449c1c 14-Jun-2021 Matt Spinler <spinler@us.ibm.com>

monitor: Shut down if no readings at power on

If there are no tach sensors on D-Bus when the power state changes to
on, then create an event log and shut down the system. This is done
because in this case the code is not able to know the fan state - if
there are any present or spinning.

The most likely reason there are no sensors (aside from a glaring error
in the config file) is because the fan controller device driver failed
its probe and was unable to detect it, maybe because the device didn't
have power or there was an I2C problem. To aid in root cause analysis
if this were to occur in the field, the code adds the following FFDC
(First Failure Data Capture) to the event log:

* All of the loaded hwmon drivers, taken from /sys/class/hwmon/*/name
* Failure related lines in dmesg, which is where driver errors would
show up.

Tested: Unbound the fan device driver and then powered on the system.
Also disabled I2C to the fan controller device in simulation and tried a
power on.

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: Ic0b80d67ec79c9401f59324fe1134ff12084112a
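
Collecting the loaded hwmon driver names could look roughly like the sketch below, which reads the first line of each `name` file under a root directory. The function name is hypothetical, the root is a parameter so the code can be exercised against any directory, and the dmesg part of the FFDC is not covered.

```cpp
#include <algorithm>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <vector>

// Hypothetical helper: returns the contents of <root>/*/name, sorted.
// Unreadable entries and a missing root are skipped without throwing.
std::vector<std::string>
    loadedHwmonNames(const std::filesystem::path& root = "/sys/class/hwmon")
{
    std::vector<std::string> names;
    std::error_code ec;

    for (std::filesystem::directory_iterator it{root, ec}, end;
         !ec && it != end; it.increment(ec))
    {
        std::ifstream file{it->path() / "name"};
        std::string name;
        if (std::getline(file, name))
        {
            names.push_back(name);
        }
    }

    // Directory iteration order is unspecified, so sort for stable output.
    std::sort(names.begin(), names.end());
    return names;
}
```

Using the error_code overloads keeps a missing or unreadable directory from throwing in the middle of an error path that is already handling a failure.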


# 823bc49e 21-Jun-2021 Matthew Barth <msbarth@us.ibm.com>

monitor: Use new JsonConfig object

To simplify loading config files, use the updated JsonConfig object,
which populates the available compatibility values used when retrieving
and loading the JSON file. The given load function is called if
compatibility values are found at construction or after an
interfacesAdded signal is received; it can then call `getConfFile` to
find the JSON config file to be loaded.

Change-Id: Ifc164d36c036cf0ff810018d40e8de52efc6ca58
Signed-off-by: Matthew Barth <msbarth@us.ibm.com>


# f435eb1a 11-May-2021 Matt Spinler <spinler@us.ibm.com>

monitor: Changes for power off errors

When a fan error causes a power off due to a power off action being
triggered, the previous fan error is reposted at the time of the power
off. For this error, make the following changes that will differentiate
it from the first time it was logged:

1. Change severity to Critical
2. Set POWER_THERMAL_CRITICAL_FAULT=TRUE in the additional data
3. Set SEVERITY_DETAIL=SYSTEM_TERM in the additional data

Certain implementations, such as the IBM one, will take additional
actions based on these changes.

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: I5f36171e58493130114427f9e9fd870cd0d2dd76
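
The three changes listed above amount to a small transformation of the original error. In this sketch the EventLog struct and the function name are hypothetical stand-ins for whatever the logging interface actually uses; only the severity value and the two additional-data keys come from the commit message.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified view of an event log entry.
struct EventLog
{
    std::string severity;
    std::map<std::string, std::string> additionalData;
};

// Build the reposted error from the original one, applying the three
// changes described in the commit message. Existing additional data
// is preserved.
EventLog makePowerOffRepost(EventLog original)
{
    original.severity = "Critical";
    original.additionalData["POWER_THERMAL_CRITICAL_FAULT"] = "TRUE";
    original.additionalData["SEVERITY_DETAIL"] = "SYSTEM_TERM";
    return original;
}
```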


# 4283c5d5 01-Mar-2021 Matt Spinler <spinler@us.ibm.com>

monitor: Allow missing D-Bus sensors on startup

Now that phosphor-fan-monitor is starting at the multi-user target, it
may be starting before the fan sensor hwmon daemon is able to put the
tach reading sensors on D-Bus. This was causing the TachSensor class
objects to not get created, so even if the hwmon tach sensor values did
show up later on D-Bus, fan monitor wouldn't notice them.

To fix this, still create the TachSensor objects if the corresponding
hwmon D-Bus objects aren't there, and still set them to functional in
the inventory so that any other monitoring code, such as
phosphor-dbus-monitor, won't shut down the system before the hwmon tach
sensors get a chance to show up on D-Bus, which was happening on
witherspoon when a reboot was done with the power on.

When the monitor delay timer expires to kick off monitoring, a D-Bus
read is forced, and if the hwmon sensors still aren't on D-Bus then the
corresponding TachSensor objects will be set to nonfunctional to start
down the error paths.

Also, when the power state changes to on, instead of blindly setting all
TachSensor objects to functional, again check if their hwmon sensor
values are on D-Bus before doing so.

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: I3e62727296630bf68602b0472328f4613e1a78e3


# ba53d3e4 24-Feb-2021 Matthew Barth <msbarth@us.ibm.com>

monitor: Correct spelling in exception message

Signed-off-by: Matthew Barth <msbarth@us.ibm.com>
Change-Id: I9f1c5e01db1fc577c3c43bf1510c00a10a419f36

