History log of /openbmc/phosphor-fan-presence/monitor/fan.cpp (Results 1 – 25 of 61)
Revision Date Author Comments
# 3494a579 28-Nov-2023 Matt Spinler <spinler@us.ibm.com>

monitor: Remove presence check in startMonitor()

Before this fix, there was a window where if a fan was removed after
power on but before the monitor start timer expired, no power off
actions would run. This was because the tach sensor value was always
zero, so no tach changed signals were emitted and no handlers ran to
start the countdown to the fan becoming nonfunctional.

The startMonitor() function was already calling tachChanged() manually,
but only for present fans. Change the code to call it regardless of
whether the fan is present, so that tachChanged() can properly handle
tach sensor values of zero.

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: I554e297aeb3f9efb0007721bd8c6305fa706cfed



# fbf4703f 17-Jul-2023 Patrick Williams <patrick@stwcx.xyz>

use std::format instead of fmt::format

The std::format is sufficient for the uses in this repository except
for in one file (override_fan_target.cpp, since P2286 isn't supported
by GCC yet). Switch to std::format whenever possible.

Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
Change-Id: Ib2576fb530a4d7ce238e1b0bd95b40b476ec2107



# f724c16b 10-May-2023 Matt Spinler <spinler@us.ibm.com>

monitor: Support a separate upper deviation

Add an optional 'upper_deviation' field to the fan monitor config and if
supplied it will be used for the allowed deviation when the fan value is
over the target. If not supplied it will work as today and the single
deviation value will be used for both the upper and lower bounds.

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: I085dc1996832e79b94bd1df3a05681d107f466eb
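
The effect of an optional upper deviation can be sketched as below. This is a minimal illustration, not the code from this commit: the function name `allowedRange` is hypothetical, and it assumes both deviations are percentages of the target.

```cpp
#include <cstdint>
#include <optional>
#include <utility>

// Hypothetical helper: returns the {min, max} tach reading allowed for a
// target. Assumes 'deviation' and 'upperDeviation' are percentages.
std::pair<uint64_t, uint64_t>
    allowedRange(uint64_t target, uint64_t deviation,
                 std::optional<uint64_t> upperDeviation)
{
    // Without an upper deviation, the single value bounds both sides.
    const uint64_t upper = upperDeviation.value_or(deviation);

    // Clamp the lower bound at zero if the deviation is 100% or more.
    const uint64_t min =
        (deviation >= 100) ? 0 : target * (100 - deviation) / 100;
    const uint64_t max = target * (100 + upper) / 100;

    return {min, max};
}
```

With a target of 10000 and a deviation of 15, the range is 8500 to 11500. Supplying an upper deviation of 40 widens only the upper bound, to 14000.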



# 18fb12b8 09-May-2023 Matt Spinler <spinler@us.ibm.com>

monitor: Change Fan/Sensor def tuples to structs

The tuples were big and hard to read/use. Change them to structs so the
members are always named.

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: I79826563faf44636b251e614f45ff86f1e02c607
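
To illustrate the readability gain in general terms, here is a small sketch contrasting positional tuple access with named struct members. The type and field names (`SensorTuple`, `SensorDefinition`, `name`, `hasTarget`, `threshold`) are hypothetical and are not the definitions used in the repository.

```cpp
#include <cstddef>
#include <string>
#include <tuple>

// Tuple style: each slot is identified only by its position.
using SensorTuple = std::tuple<std::string, bool, std::size_t>;

// Struct style: the same data with named members (names are hypothetical).
struct SensorDefinition
{
    std::string name;
    bool hasTarget;
    std::size_t threshold;
};

// Converts the positional form into the named form.
SensorDefinition fromTuple(const SensorTuple& t)
{
    return SensorDefinition{std::get<0>(t), std::get<1>(t), std::get<2>(t)};
}
```

A caller then writes `def.threshold` instead of `std::get<2>(def)`, so adding or reordering members no longer silently changes what an index means.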



# 466bd228 25-Jan-2023 Matt Spinler <spinler@us.ibm.com>

monitor: Change 'input' to 'actual' in trace

Change the 'input' label in a trace to 'actual' to make it more obvious
that it means the fan's actual speed.

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: I76686bf8b68ad7b8fb5ad994a78a50eb1a60900c



# 27cc39f1 20-Sep-2022 Chau Ly <chaul@amperecomputing.com>

monitor: Support target_path option in configuration file

The target path containing the interface
xyz.openbmc_project.control.FanPwm or
xyz.openbmc_project.control.FanSpeed can be different between
phosphor-hwmon and dbus-sensor.

In phosphor-hwmon, it is "/xyz/openbmc_project/sensors/fan_tach/".
In dbus-sensor, it is "/xyz/openbmc_project/control/fanpwm/".

This commit supports making this path configurable via "target_path"
as a full object path.
By default, it is the same as the fan tach object path.

Signed-off-by: Chau Ly <chaul@amperecomputing.com>
Change-Id: I2658a462dd49a98ad77d684f2927e6ccae21cd15
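
The defaulting rule described above can be sketched as follows. The function name `resolveTargetPath` is hypothetical, and the sketch only shows the fallback to the tach object path, not how the configuration is parsed.

```cpp
#include <optional>
#include <string>

// Hypothetical helper: picks the object path that hosts the target
// interface. If no "target_path" was configured, the tach sensor's own
// object path is used.
std::string resolveTargetPath(const std::string& tachPath,
                              const std::optional<std::string>& targetPath)
{
    return targetPath.value_or(tachPath);
}
```

Leaving `target_path` unset therefore preserves the earlier behavior, while a configuration for a different sensor stack can point at a separate object path.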



# cb356d48 22-Jul-2022 Patrick Williams <patrick@stwcx.xyz>

sdbusplus: use shorter type aliases

The sdbusplus headers provide shortened aliases for many types.
Switch to using them to provide better code clarity and shorter
lines. Possible replacements are for:
* bus_t
* exception_t
* manager_t
* match_t
* message_t
* object_t
* slot_t

Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
Change-Id: I9029cc722e7712633c15436bd3868d8c3209f567



# ae01b5fc 06-Jul-2022 Matt Spinler <spinler@us.ibm.com>

monitor: Don't create errors for missing sensors

Don't start the error timer (so no errors would be created) in the cases
where the tach sensor is being made nonfunctional because its value
can't be found on D-Bus, as that error indicates a problem with the fan
itself, which is fine.

Also don't check any power off actions in this case either so a running
system isn't turned off.

If no sensors at all are on D-Bus, that is already caught during a power
on.

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: If5cd6d1d444be11b9666500db89b4bb82d7b282a



# 808d7fe8 13-Jun-2022 Mike Capps <mikepcapps@gmail.com>

meson support: remove code warnings 1

This commit contains code changes necessary to support the increased
warning level from Meson builds. Most changes are for unused variables.

To keep the review size manageable, this commit contains only monitor
and presence changes (and top-level json_config.hpp).

Signed-off-by: Mike Capps <mikepcapps@gmail.com>
Change-Id: I7280b512c54e8d5aeba3300764a239f3dcbab14d



# 7b34ee0f 04-May-2022 Mike Capps <mikepcapps@gmail.com>

monitor: include previous targets and tachs in PEL

To discover the source of certain fan ramp-up failures, this change
outputs the previous 8 targets and tach readings. The strategy is to see
if hardware limitations prevent attaining the targets quickly enough.

Signed-off-by: Mike Capps <mikepcapps@gmail.com>
Change-Id: Ia38867986b8a8a651de5d01766393c07d413273c
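
Keeping only the most recent 8 values can be done with a small bounded history. The sketch below assumes a `std::deque` and a hypothetical class name `ReadingHistory`; the commit does not state which container it uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical bounded history: retains the most recent 'capacity'
// values and discards the oldest one when full.
class ReadingHistory
{
  public:
    explicit ReadingHistory(std::size_t capacity = 8) : _capacity(capacity)
    {}

    // Records a new target or tach reading.
    void push(uint64_t value)
    {
        if (_capacity == 0)
        {
            return;
        }
        if (_values.size() == _capacity)
        {
            _values.pop_front();
        }
        _values.push_back(value);
    }

    // Oldest value first, newest value last.
    const std::deque<uint64_t>& values() const
    {
        return _values;
    }

  private:
    std::size_t _capacity;
    std::deque<uint64_t> _values;
};
```

One instance could hold targets and another tach readings, so that both sequences are available when an error log is created.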



# 43b4cdeb 15-Feb-2022 Matthew Barth <msbarth@us.ibm.com>

monitor - Run fan status/health change after sensors updated

A bug was found where removing a fan and replugging it did not reset the
shutdown counters, causing the system to shut down after additional fans
were removed/replugged. This was caused by the fan status/health being
updated prior to the fan and its sensor's functional states being
updated. Updating the fan status/health after the fan and its sensor's
functional states are updated provides the correct state to the
shutdown rules.

Change-Id: I61957dcd43e59ac672b04f3d893339b5f512e718
Signed-off-by: Matthew Barth <msbarth@us.ibm.com>



# 8af8a62b 04-Feb-2022 Mike Capps <mikepcapps@gmail.com>

Monitor : avoid mapper lookups in Tach Sensor

There is a race condition that can fail in certain situations where
mapper has not completed introspection, causing a lookup failure and
D-Bus exception. This change uses a hard-coded service name, thus
avoiding the lookup and allowing tach sensors to write status directly
to inventory.

Signed-off-by: Mike Capps <mikepcapps@gmail.com>
Change-Id: If6ad105055762c6240b36d904d169094088b7c81



# 8a8aa442 19-Nov-2021 Matthew Barth <msbarth@us.ibm.com>

monitor: Add ignoring feedback sensors above allowed max

There may be situations where fan rotor feedback speeds above the
allowed range could be ignored, essentially ignoring any overspeed
detections. This adds an optional `ignore_above_max` attribute to the
fan sensors that can be added, set to 'true', to ignore marking that fan
sensor as nonfunctional when it's above the allowed max. Without this
attribute, the fan sensor must remain within the allowed min/max range
to be deemed functional. This is the default behavior.

Change-Id: I4876b4c1963a8a1becd55732ca19496f5df6b003
Signed-off-by: Matthew Barth <msbarth@us.ibm.com>
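
The range check with the new attribute can be sketched as below. The function name `isReadingAcceptable` and its parameters are hypothetical; the sketch only captures the rule described in the message.

```cpp
#include <cstdint>

// Hypothetical check: returns true when a tach reading should not cause
// the sensor to be marked nonfunctional.
bool isReadingAcceptable(uint64_t tach, uint64_t min, uint64_t max,
                         bool ignoreAboveMax)
{
    if (tach < min)
    {
        // Readings below the minimum are never ignored.
        return false;
    }
    if (tach > max)
    {
        // Overspeed is tolerated only when ignore_above_max is true.
        return ignoreAboveMax;
    }
    return true;
}
```

With `ignoreAboveMax` left at `false`, the function reduces to a plain min/max range check, which matches the default behavior.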



# fdcd5db3 20-May-2021 Mike Capps <mikepcapps@gmail.com>

monitor: Subscribe to tach target and feedback services

Subscribes to nameOwnerChanged signals for the services of the sensor
and target interfaces for each configured fan. If those services go
offline, the fan tach sensors should get marked nonfunctional due to no
longer receiving updated target or feedback values. In this design, we
use the existing method of determining when a fan tach sensor should be
marked nonfunctional to allow a recovery window, wherein a brief
offline/online transition (such as during a restart) will not trigger a
nonfunctional state change.

Change-Id: I0a935ccad5a864dc952d023185356a1ef1226830
Signed-off-by: Mike Capps <mikepcapps@gmail.com>



# 9ff48777 19-Jul-2021 Mike Capps <mikepcapps@gmail.com>

monitor : do not trigger error when inventory unavailable

Under some startup scenarios, fan-monitor can start before the
InventoryManager service has populated inventory. This leads to
false-positives when marking fans non-functional.

This fix detects D-Bus exceptions and defaults the fans to functional.
They subscribe to the interfaces-added signal which will populate the
fans with correct values as soon as Inventory Manager completes startup.

Signed-off-by: Mike Capps <mikepcapps@gmail.com>
Change-Id: I14bff20da14cba3f5ef4b79763867b5cecab0267



# bb449c1c 14-Jun-2021 Matt Spinler <spinler@us.ibm.com>

monitor: Shut down if no readings at power on

If there are no tach sensors on D-Bus when the power state changes to
on, then create an event log and shut down the system. This is done
because in this case the code is not able to know the fan state - if
there are any present or spinning.

The most likely reason there are no sensors (aside from a glaring error
in the config file) is because the fan controller device driver failed
its probe and was unable to detect it, maybe because the device didn't
have power or there was an I2C problem. To aid in root cause analysis
if this were to occur in the field, the code adds the following FFDC
(First Failure Data Capture) to the event log:

* All of the loaded hwmon drivers, taken from /sys/class/hwmon/*/name
* Failure related lines in dmesg, which is where driver errors would
show up.

Tested: Unbound the fan device driver and then powered on the system.
Also disabled I2C to the fan controller device in simulation and tried a
power on.

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: Ic0b80d67ec79c9401f59324fe1134ff12084112a
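
Collecting the loaded hwmon driver names from /sys/class/hwmon/*/name might look like the sketch below. It is an illustration only: the function name `readHwmonNames` is hypothetical, and the base directory is a parameter so the function can be exercised against a test directory.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <vector>

// Hypothetical helper: reads the first line of <base>/<entry>/name for
// every entry under 'base'. Returns an empty list if 'base' is missing.
std::vector<std::string>
    readHwmonNames(const std::filesystem::path& base = "/sys/class/hwmon")
{
    std::vector<std::string> names;
    std::error_code ec;

    for (std::filesystem::directory_iterator it{base, ec}, end;
         !ec && it != end; it.increment(ec))
    {
        std::ifstream file{it->path() / "name"};
        std::string name;
        if (file && std::getline(file, name))
        {
            names.push_back(name);
        }
    }

    return names;
}
```

The error-code overloads avoid exceptions when the directory is absent, which matters for data collected on a failure path.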



# ce6820ab 26-May-2021 Mike Capps <mikepcapps@gmail.com>

monitor: Read fan state from dbus inventory upon starting

tach sensors previously defaulted to functional. Now they default to the
inventory state. For counter-based sensors, nonfunctional state forces
the count to exceed the threshold.

Fan's functional state now depends on the number of functional sensors.

test plan:
1) power-off chassis, stop phosphor-fan-monitor service
2) use busctl to make a fan's tach sensor nonfunctional
3) start fan-monitor
4) verify the fan's state == nonfunctional
5) poweron chassis, verify that fan-monitor updates _sensor_ to
functional
6) fan should remain nonfunctional
7) set Fan to non-present, then to present again (simulate
replacement)
8) observe the fan's functional state is now true in inventory

Signed-off-by: Mike Capps <mikepcapps@gmail.com>
Change-Id: I0a109a1d85390c0201f8a54942efcd2fb21a2b65



# fdfcc679 01-Jun-2021 Matt Spinler <spinler@us.ibm.com>

monitor: Move count timer into TachSensor

Instead of the Fan class owning the count error detection method timer,
this commit moves it into each TachSensor object.

This timer will be stopped until the first time Fan::tachChanged()
detects an out of range sensor, then it will be started. From that
point on, the timer expiration function is what will call Fan::process()
to increment/decrement the count. If the count goes back to zero, then
the timer will be stopped and Fan::tachChanged() will take back over.

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: I1cfc8440d299302b088f53764b71c06ea513690b
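
The handoff between the tach changed handler and the timer can be modeled as a small state machine. The sketch below uses a hypothetical class `CountMonitor` and a boolean in place of a real timer. It also assumes that the first out-of-range detection only starts the timer and does not itself change the count, which the message does not specify.

```cpp
#include <cstddef>

// Hypothetical model of one sensor's count-based monitoring.
class CountMonitor
{
  public:
    // Models Fan::tachChanged(): only acts while the timer is stopped.
    void onTachChanged(bool outOfRange)
    {
        if (!_timerRunning && outOfRange)
        {
            _timerRunning = true;
        }
    }

    // Models the timer expiration: adjusts the count, then stops the
    // timer once the count is back at zero.
    void onTimerExpired(bool outOfRange)
    {
        if (!_timerRunning)
        {
            return;
        }
        if (outOfRange)
        {
            ++_count;
        }
        else if (_count > 0)
        {
            --_count;
        }
        if (_count == 0)
        {
            _timerRunning = false;
        }
    }

    bool timerRunning() const
    {
        return _timerRunning;
    }

    std::size_t count() const
    {
        return _count;
    }

  private:
    bool _timerRunning = false;
    std::size_t _count = 0;
};
```

Once the count returns to zero the timer stops, and the next out-of-range tach change is what starts it again.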



# cceffdd9 20-May-2021 Matthew Barth <msbarth@us.ibm.com>

monitor: Tie fan FRU functional state with `set_func_on_present`

Only update the fan FRU to functional when the fan FRU has been replaced
or a poweroff->poweron has occurred while `set_func_on_present`
configuration is set.

Signed-off-by: Matthew Barth <msbarth@us.ibm.com>
Change-Id: I814b29c7d773a403ccf25b8edb9a196528bfb139



# a3584bd2 29-Mar-2021 Matt Spinler <spinler@us.ibm.com>

monitor: Support set_func_on_present

The set_func_on_present JSON field determines if the fan FRU and rotors
should be set to functional immediately when a fan plug is detected. It
is optional in the JSON file, and defaults to false if not present.

When this is false, a fan will have to spin back up again before it is
set back to functional, meaning that if the new fan is faulty and never
spins up there may not be another error created. If a faulty fan is
plugged when this setting is true, then an error will be created as soon
as the configuration allows.

In some system configurations, the functional status on the fan FRU may
also control LED status, so setting the FRU to functional on the plug
would also turn off a fault LED, which is a desired behavior in the
systems that do this.

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: If1a8e62a7b390e8137353a7ecc423a60df138315



# 623635c6 29-Mar-2021 Matt Spinler <spinler@us.ibm.com>

monitor: Create count timer

When the method for monitoring fan speeds is set to 'count', which is
when an up/down counter is used to determine when a fan should be
considered faulted/nonfunctional, there needs to be some interval at
which the speeds are checked and the count is changed if necessary.
Otherwise, if the checks just happened in the tach changed callback as
they do today, then a fan with a non-changing rotor speed would never
trigger a check.

This commit creates a new timer in the Fan class that is constantly
running when the monitor is active if any of the fan's sensors use this
mode. In the expiration function, the sensors' tach targets will be
compared to their input values and the error count will be
incremented/decremented as was previously happening in the tach changed
callback.

The interval to use is read from a new 'count_interval' property in the
JSON, which defaults to 1 if not present and the count method is
enabled.

The timer is started in the startMonitor function, and stopped when a
power off is detected.

This commit also added some DEBUG journal traces when the counter on a
sensor changes values to help with debugging the new code, and so it can
be enabled on a system if desired by enabling debug level tracing.

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: I238606e95bb91df93afd6ec7c00bd0577bc603f2
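
The up/down counting that runs on each timer expiration can be sketched as below. The class name `ErrorCounter` is hypothetical. The rule that a sensor is faulted once the count reaches a threshold, and that the count is capped at that threshold, are assumptions made for the illustration.

```cpp
#include <cstddef>

// Hypothetical up/down error counter, evaluated once per count interval.
class ErrorCounter
{
  public:
    explicit ErrorCounter(std::size_t threshold) : _threshold(threshold)
    {}

    // Called on each timer expiration with the result of comparing the
    // sensor's actual speed against its target.
    void tick(bool outOfRange)
    {
        if (outOfRange)
        {
            if (_count < _threshold)
            {
                ++_count; // count up, capped at the threshold
            }
        }
        else if (_count > 0)
        {
            --_count; // count back down while readings are in range
        }
    }

    // Assumption: the sensor is faulted once the count hits the threshold.
    bool faulted() const
    {
        return _count >= _threshold;
    }

    std::size_t count() const
    {
        return _count;
    }

  private:
    std::size_t _threshold;
    std::size_t _count = 0;
};
```

Because `tick()` is driven by a timer rather than by tach changed signals, a rotor stuck at a constant out-of-range speed still advances the count.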



# 4283c5d5 01-Mar-2021 Matt Spinler <spinler@us.ibm.com>

monitor: Allow missing D-Bus sensors on startup

Now that phosphor-fan-monitor is starting at the multi-user target, it
may be starting before the fan sensor hwmon daemon is able to put the
tach reading sensors on D-Bus. This was causing the TachSensor class
objects to not get created so even if the hwmon tach sensor values did
show up later on D-Bus fan monitor wouldn't notice them.

To fix this, still create the TachSensor objects if the corresponding
hwmon D-Bus objects aren't there, and still set them to functional in
the inventory so that any other monitoring code, such as
phosphor-dbus-monitor, won't shut down the system before the hwmon tach
sensors get a chance to show up on D-Bus, which was happening on
witherspoon when a reboot was done with the power on.

When the monitor delay timer expires to kick off monitoring, a D-Bus
read is forced, and if the hwmon sensors still aren't on D-Bus then the
corresponding TachSensor objects will be set to nonfunctional to start
down the error paths.

Also, when the power state changes to on, instead of blindly setting all
TachSensor objects to functional, again check if their hwmon sensor
values are on D-Bus before doing so.

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: I3e62727296630bf68602b0472328f4613e1a78e3



# 7d135641 04-Feb-2021 Matt Spinler <spinler@us.ibm.com>

monitor: Support for running with power off

Put in the remaining changes necessary so that fan monitor doesn't need
to be killed when power turns off.

This includes things like:
* Support for starting before the Present property is on D-Bus.
* Support for starting before the config file name is available.
* Stopping any running timers when power is turned off.
* Checking the power off rules when power turns on.

Most, but not all, of the changes are common between the JSON and YAML
modes, but this is only truly supported when compiled for JSON.

This also removes the init vs monitor modes of operation, if compiled
for JSON.

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: Ic2c6848f24511c9dc763227e05bbebb4c8c80cd1



# fcb0dbcb 10-Feb-2021 Matthew Barth <msbarth@us.ibm.com>

monitor: Trust group cancel/start monitoring

Any fan monitoring trust groups configured should cancel the monitoring
of all fan sensors configured in the trust group for all monitoring
methods (timer based or count based) when the group is determined not to
be trusted.

Also, the trust manager should restart monitoring all the sensors in the
group by processing each sensor against the configured monitoring
method. This matches how each sensor is processed on each tach changed
signal where only when a tach changed signal is received does the trust
state of the sensor get checked prior to processing the state of the
sensor.

Tested:
Verify fan sensors in trust group's functional state is correct

Signed-off-by: Matthew Barth <msbarth@us.ibm.com>
Change-Id: I7b2ffc538eb1f17017826235353dba85e4f92ebe



# ac37297d 25-Jan-2021 Matt Spinler <spinler@us.ibm.com>

monitor: Write fan presence changes to journal

These traces were already there, but they were only saved in the trace
buffer because the thinking was that fan-presence would already write
them to the journal. That isn't the case, though: if fan-monitor is
enabled to handle fan presence errors, then fan-presence won't be, so it
won't trace things either.

Signed-off-by: Matt Spinler <spinler@us.ibm.com>
Change-Id: I439eed79b72bfb8f5dd83aadc73fecf3e0de0849


