#
dfddd648 |
| 16-Aug-2024 |
Patrick Williams <patrick@stwcx.xyz> |
clang-format: re-format for clang-18
clang-format-18 isn't compatible with the clang-format-17 output, so we need to reformat the code with the latest version. The way clang-18 handles lambda formatting also changed, so we have made changes to the organization default style format to better handle lambda formatting.
See I5e08687e696dd240402a2780158664b7113def0e for updated style. See Iea0776aaa7edd483fa395e23de25ebf5a6288f71 for clang-18 enablement.
Change-Id: Ica590f8613f1fb89ab1ca676ac51c1cc7e38d67f Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
|
#
3494a579 |
| 28-Nov-2023 |
Matt Spinler <spinler@us.ibm.com> |
monitor: Remove presence check in startMonitor()
Before this fix, there was a window where if a fan was removed after power on but before the monitor start timer expired, no power off actions would run. This was because the tach sensor value was always zero, so there would be no tach changed signals so no handlers would run to start the countdown to the fan becoming nonfunctional.
The startMonitor() function was calling tachChanged() already manually, but only for present fans. Change the code to call it regardless of if the fan is present or not, so that tachChanged() can properly handle the tach sensor values of zero.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I554e297aeb3f9efb0007721bd8c6305fa706cfed
|
#
fbf4703f |
| 17-Jul-2023 |
Patrick Williams <patrick@stwcx.xyz> |
use std::format instead of fmt::format
The std::format is sufficient for the uses in this repository except for in one file (override_fan_target.cpp, since P2286 isn't supported by GCC yet). Switch to std::format whenever possible.
Signed-off-by: Patrick Williams <patrick@stwcx.xyz> Change-Id: Ib2576fb530a4d7ce238e1b0bd95b40b476ec2107
|
#
f724c16b |
| 10-May-2023 |
Matt Spinler <spinler@us.ibm.com> |
monitor: Support a separate upper deviation
Add an optional 'upper_deviation' field to the fan monitor config and if supplied it will be used for the allowed deviation when the fan value is over the target. If not supplied it will work as today and the single deviation value will be used for both the upper and lower bounds.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I085dc1996832e79b94bd1df3a05681d107f466eb
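The fallback described in this commit can be sketched as follows. This is a minimal illustration, not the repository's code: the function name `allowedRange`, its parameters, and the assumption that deviations are whole percentages no larger than 100 are all hypothetical.

```cpp
#include <cstdint>
#include <optional>
#include <utility>

// Hypothetical helper (not from the repository): returns the allowed
// {min, max} tach range around a target. Deviations are assumed to be
// whole percentages, with deviation <= 100.
std::pair<std::uint64_t, std::uint64_t>
    allowedRange(std::uint64_t target, std::uint64_t deviation,
                 std::optional<std::uint64_t> upperDeviation)
{
    // When no upper deviation is configured, the single deviation value
    // is used for both bounds.
    const std::uint64_t upper = upperDeviation.value_or(deviation);
    const std::uint64_t min = target * (100 - deviation) / 100;
    const std::uint64_t max = target * (100 + upper) / 100;
    return {min, max};
}
```

With a target of 10000 and a deviation of 15, the range would be 8500 to 11500; supplying an upper deviation of 30 would widen only the upper bound, to 13000.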
|
#
18fb12b8 |
| 09-May-2023 |
Matt Spinler <spinler@us.ibm.com> |
monitor: Change Fan/Sensor def tuples to structs
The tuples were big and hard to read/use. Change them to structs so the members are always named.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I79826563faf44636b251e614f45ff86f1e02c607
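The readability gain can be illustrated with a reduced sketch. The field names and types below are hypothetical and far smaller than the real definitions; they only contrast index-based and name-based access.

```cpp
#include <string>
#include <tuple>

// Hypothetical, heavily reduced sensor definition as a tuple.
using SensorTuple = std::tuple<std::string, bool, double>;

// The same hypothetical definition as a struct with named members.
struct SensorDefinition
{
    std::string name;
    bool hasTarget;
    double factor;
};

// With a tuple, the reader has to remember which index holds which field.
double factorOf(const SensorTuple& sensor)
{
    return std::get<2>(sensor);
}

// With a struct, the member name documents itself.
double factorOf(const SensorDefinition& sensor)
{
    return sensor.factor;
}
```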
|
#
466bd228 |
| 25-Jan-2023 |
Matt Spinler <spinler@us.ibm.com> |
monitor: Change 'input' to 'actual' in trace
Change the 'input' label in a trace to 'actual' to make it more obvious that it means the fan's actual speed.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I76686bf8b68ad7b8fb5ad994a78a50eb1a60900c
|
#
27cc39f1 |
| 20-Sep-2022 |
Chau Ly <chaul@amperecomputing.com> |
monitor: Support target_path option in configuration file
The target path containing the interface xyz.openbmc_project.control.FanPwm or xyz.openbmc_project.control.FanSpeed can be different between phosphor-hwmon and dbus-sensor.
In phosphor-hwmon, it is "/xyz/openbmc_project/sensors/fan_tach/". In dbus-sensor, it is "/xyz/openbmc_project/control/fanpwm/".
This commit supports making this path configurable via "target_path" as a full object path. By default, it is the same as the fan tach object path.
Signed-off-by: Chau Ly <chaul@amperecomputing.com> Change-Id: I2658a462dd49a98ad77d684f2927e6ccae21cd15
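The default described above could look roughly like this. The helper name `resolveTargetPath` and the convention that an empty string means "not configured" are assumptions made for illustration; the repository may represent the option differently.

```cpp
#include <string>

// Hypothetical helper: chooses the object path used for the target
// interface. An empty targetPath is assumed to mean that "target_path"
// was not given in the configuration.
std::string resolveTargetPath(const std::string& tachPath,
                              const std::string& targetPath)
{
    // By default the target lives on the same path as the fan tach sensor.
    return targetPath.empty() ? tachPath : targetPath;
}
```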
|
#
cb356d48 |
| 22-Jul-2022 |
Patrick Williams <patrick@stwcx.xyz> |
sdbusplus: use shorter type aliases
The sdbusplus headers provide shortened aliases for many types. Switch to using them to provide better code clarity and shorter lines. Possible replacements are for:
* bus_t
* exception_t
* manager_t
* match_t
* message_t
* object_t
* slot_t
Signed-off-by: Patrick Williams <patrick@stwcx.xyz> Change-Id: I9029cc722e7712633c15436bd3868d8c3209f567
|
#
ae01b5fc |
| 06-Jul-2022 |
Matt Spinler <spinler@us.ibm.com> |
monitor: Don't create errors for missing sensors
Don't start the error timer (so no errors would be created) in the cases where the tach sensor is being made nonfunctional because its value can't be found on D-Bus, as that error indicates a problem with the fan itself, which is fine.
Also don't check any power off actions in this case either so a running system isn't turned off.
If no sensors at all are on D-Bus, that is already caught during a power on.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: If5cd6d1d444be11b9666500db89b4bb82d7b282a
|
#
808d7fe8 |
| 13-Jun-2022 |
Mike Capps <mikepcapps@gmail.com> |
meson support: remove code warnings 1
This commit contains code changes necessary to support the increased warning level from Meson builds. Most changes are for unused variables.
To keep the review size manageable, this commit contains only monitor and presence changes (and top-level json_config.hpp).
Signed-off-by: Mike Capps <mikepcapps@gmail.com> Change-Id: I7280b512c54e8d5aeba3300764a239f3dcbab14d
|
#
7b34ee0f |
| 04-May-2022 |
Mike Capps <mikepcapps@gmail.com> |
monitor: include previous targets and tachs in PEL
To discover the source of certain fan ramp-up failures, this change outputs the previous 8 targets and tach readings. The strategy is to see if hardware limitations prevent attaining the targets quickly enough.
Signed-off-by: Mike Capps <mikepcapps@gmail.com> Change-Id: Ia38867986b8a8a651de5d01766393c07d413273c
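Keeping only the most recent readings can be sketched with a bounded history. The class name `History` and its interface are hypothetical; the commit does not say which container the monitor actually uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical bounded history that keeps only the newest values.
class History
{
  public:
    explicit History(std::size_t capacity) : _capacity(capacity) {}

    // Appends a value, dropping the oldest one once the history is full.
    void add(std::uint64_t value)
    {
        if (_capacity == 0)
        {
            return;
        }
        if (_values.size() == _capacity)
        {
            _values.pop_front();
        }
        _values.push_back(value);
    }

    // Oldest value first, newest value last.
    const std::deque<std::uint64_t>& values() const
    {
        return _values;
    }

  private:
    std::size_t _capacity;
    std::deque<std::uint64_t> _values;
};
```

One such history with a capacity of 8 for targets and another for tach readings would hold the data the commit describes adding to the PEL.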
|
#
43b4cdeb |
| 15-Feb-2022 |
Matthew Barth <msbarth@us.ibm.com> |
monitor - Run fan status/health change after sensors updated
A bug was found where, after removing and replugging a fan, the shutdown counters would not reset, causing the system to shut down after additional fans were removed/replugged. This was caused by the fan status/health being updated prior to the fan and its sensor's functional states being updated. Updating the fan status/health after the fan and its sensor's functional states are updated provides the correct state to the shutdown rules.
Change-Id: I61957dcd43e59ac672b04f3d893339b5f512e718 Signed-off-by: Matthew Barth <msbarth@us.ibm.com>
|
#
8af8a62b |
| 04-Feb-2022 |
Mike Capps <mikepcapps@gmail.com> |
Monitor : avoid mapper lookups in Tach Sensor
There is a race condition that can fail in certain situations where mapper has not completed introspection, causing a lookup failure and D-Bus exception. This change uses a hard-coded service name, thus avoiding the lookup and allowing tach sensors to write status directly to inventory.
Signed-off-by: Mike Capps <mikepcapps@gmail.com> Change-Id: If6ad105055762c6240b36d904d169094088b7c81
|
#
8a8aa442 |
| 19-Nov-2021 |
Matthew Barth <msbarth@us.ibm.com> |
monitor: Add ignoring feedback sensors above allowed max
There may be situations where fan rotor feedback speeds above the allowed range could be ignored, essentially ignoring any overspeed detections. This adds an optional `ignore_above_max` attribute to the fan sensors that can be added, set to 'true', to ignore marking that fan sensor as nonfunctional when it's above the allowed max. Without this attribute, the fan sensor must remain within the allowed min/max range to be deemed functional. This is the default behavior.
Change-Id: I4876b4c1963a8a1becd55732ca19496f5df6b003 Signed-off-by: Matthew Barth <msbarth@us.ibm.com>
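The effect of the attribute can be sketched as a range check. The struct `SensorRange` and the function `inRange` are hypothetical names used only to illustrate the rule stated above.

```cpp
#include <cstdint>

// Hypothetical allowed range for a fan sensor's feedback speed.
struct SensorRange
{
    std::uint64_t min;
    std::uint64_t max;
};

// Returns true when the reading should count as being within range.
bool inRange(std::uint64_t tach, const SensorRange& range,
             bool ignoreAboveMax)
{
    if (tach < range.min)
    {
        // Readings below the minimum are never ignored.
        return false;
    }
    if (tach > range.max)
    {
        // Overspeed is tolerated only when ignore_above_max is set.
        return ignoreAboveMax;
    }
    return true;
}
```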
|
#
fdcd5db3 |
| 20-May-2021 |
Mike Capps <mikepcapps@gmail.com> |
monitor: Subscribe to tach target and feedback services
Subscribes to nameOwnerChanged signals for the services of the sensor and target interfaces for each configured fan. If those services go offline, the fan tach sensors should get marked nonfunctional due to no longer receiving updated target or feedback values. In this design, we use the existing method of determining when a fan tach sensor should be marked nonfunctional to allow a recovery window, wherein a brief offline/online transition (such as during a restart) will not trigger a nonfunctional state change.
Change-Id: I0a935ccad5a864dc952d023185356a1ef1226830 Signed-off-by: Mike Capps <mikepcapps@gmail.com>
|
#
9ff48777 |
| 19-Jul-2021 |
Mike Capps <mikepcapps@gmail.com> |
monitor : do not trigger error when inventory unavailable
Under some startup scenarios, fan-monitor can start before the InventoryManager service has populated inventory. This leads to false-positives when marking fans non-functional.
This fix detects D-Bus exceptions and defaults the fans to functional. They subscribe to the interfaces-added signal which will populate the fans with correct values as soon as Inventory Manager completes startup.
Signed-off-by: Mike Capps <mikepcapps@gmail.com> Change-Id: I14bff20da14cba3f5ef4b79763867b5cecab0267
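The fallback to functional can be sketched as below. The inventory read is modeled as a plain callable so the example stays self-contained; the name `initialFunctionalState` is hypothetical, and the real code would be catching a D-Bus exception from sdbusplus rather than a generic `std::exception`.

```cpp
#include <functional>
#include <stdexcept>

// Hypothetical helper: reads the functional state through the supplied
// callable and defaults to functional if the read throws.
bool initialFunctionalState(const std::function<bool()>& readFromInventory)
{
    try
    {
        return readFromInventory();
    }
    catch (const std::exception&)
    {
        // Inventory is not available yet: assume functional and rely on a
        // later signal to supply the real value.
        return true;
    }
}
```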
|
#
bb449c1c |
| 14-Jun-2021 |
Matt Spinler <spinler@us.ibm.com> |
monitor: Shut down if no readings at power on
If there are no tach sensors on D-Bus when the power state changes to on, then create an event log and shut down the system. This is done because in this case the code is not able to know the fan state - if there are any present or spinning.
The most likely reason there are no sensors (aside from a glaring error in the config file) is because the fan controller device driver failed its probe and was unable to detect it, maybe because the device didn't have power or there was an I2C problem. To aid in root cause analysis if this were to occur in the field, the code adds the following FFDC (First Failure Data Capture) to the event log:
* All of the loaded hwmon drivers, taken from /sys/class/hwmon/*/name
* Failure related lines in dmesg, which is where driver errors would show up.
Tested: Unbound the fan device driver and then powered on the system. Also disabled I2C to the fan controller device in simulation and tried a power on.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: Ic0b80d67ec79c9401f59324fe1134ff12084112a
|
#
ce6820ab |
| 26-May-2021 |
Mike Capps <mikepcapps@gmail.com> |
monitor: Read fan state from dbus inventory upon starting
tach sensors previously defaulted to functional. Now they default to the inventory state. For counter-based sensors, nonfunctional state forces the count to exceed the threshold.
Fan's functional state now depends on the number of functional sensors.
test plan:
1) power-off chassis, stop phosphor-fan-monitor service
2) use busctl to make a fan's tach sensor nonfunctional
3) start fan-monitor
4) verify the fan's state == nonfunctional
5) poweron chassis, verify that fan-monitor updates _sensor_ to functional
6) fan should remain nonfunctional
7) set Fan to non-present, then to present again (simulate replacement)
8) observe the fans functional state is now true in inventory
Signed-off-by: Mike Capps <mikepcapps@gmail.com> Change-Id: I0a109a1d85390c0201f8a54942efcd2fb21a2b65
|
#
fdfcc679 |
| 01-Jun-2021 |
Matt Spinler <spinler@us.ibm.com> |
monitor: Move count timer into TachSensor
Instead of the Fan class owning the count error detection method timer, this commit moves it into each TachSensor object.
This timer will be stopped until the first time Fan::tachChanged() detects an out of range sensor, then it will be started. From that point on, the timer expiration function is what will call Fan::process() to increment/decrement the count. If the count goes back to zero, then the timer will be stopped and Fan::tachChanged() will take back over.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I1cfc8440d299302b088f53764b71c06ea513690b
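The hand-off between the tach changed path and the timer can be modeled roughly as follows. This is a simplified sketch: `CountSensorModel` is a hypothetical name, the timer is reduced to a flag rather than a real event-loop timer, and it assumes the first out-of-range detection only starts the timer without changing the count.

```cpp
#include <cstddef>

// Hypothetical model of which path drives the count for one sensor.
struct CountSensorModel
{
    std::size_t count = 0;
    bool timerRunning = false;

    // Models the tach changed path: it only acts while the timer is
    // stopped, and starts it on the first out-of-range reading.
    void tachChanged(bool outOfRange)
    {
        if (!timerRunning && outOfRange)
        {
            timerRunning = true;
        }
    }

    // Models the timer expiration path: it adjusts the count and stops
    // the timer once the count is back at zero.
    void timerExpired(bool outOfRange)
    {
        if (!timerRunning)
        {
            return;
        }
        if (outOfRange)
        {
            ++count;
        }
        else if (count > 0)
        {
            --count;
        }
        if (count == 0)
        {
            timerRunning = false;
        }
    }
};
```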
|
#
cceffdd9 |
| 20-May-2021 |
Matthew Barth <msbarth@us.ibm.com> |
monitor: Tie fan FRU functional state with `set_func_on_present`
Only update the fan FRU to functional when the fan FRU has been replaced or a poweroff->poweron has occurred while `set_func_on_present` configuration is set.
Signed-off-by: Matthew Barth <msbarth@us.ibm.com> Change-Id: I814b29c7d773a403ccf25b8edb9a196528bfb139
|
#
a3584bd2 |
| 29-Mar-2021 |
Matt Spinler <spinler@us.ibm.com> |
monitor: Support set_func_on_present
The set_func_on_present JSON field determines if the fan FRU and rotors should be set to functional immediately when a fan plug is detected. It is optional in the JSON file, and defaults to false if not present.
When this is false, a fan will have to spin back up again before it is set back to functional, meaning that if the new fan is faulty and never spins up there may not be another error created. If a faulty fan is plugged when this setting is true, then an error will be created as soon as the configuration allows.
In some system configurations, the functional status on the fan FRU may also control LED status, so setting the FRU to functional on the plug would also turn off a fault LED, which is a desired behavior in the systems that do this.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: If1a8e62a7b390e8137353a7ecc423a60df138315
|
#
623635c6 |
| 29-Mar-2021 |
Matt Spinler <spinler@us.ibm.com> |
monitor: Create count timer
When the method for monitoring fan speeds is set to 'count', which is when an up/down counter is used to determine when a fan should be considered faulted/nonfunctional, there needs to be some interval at which the speeds are checked and the count is changed if necessary. Otherwise, if the checks just happened in the tach changed callback as they do today, then a fan with a non-changing rotor speed would never trigger a check.
This commit creates a new timer in the Fan class that is constantly running when the monitor is active if any of the fan's sensors use this mode. In the expiration function, the sensors' tach targets will be compared to their input values and the error count will be incremented/decremented as was previously happening in the tach changed callback.
The interval to use is read from a new 'count_interval' property in the JSON, which defaults to 1 if not present and the count method is enabled.
The timer is started in the startMonitor function, and stopped when a power off is detected.
This commit also added some DEBUG journal traces when the counter on a sensor changes values to help with debugging the new code, and so it can be enabled on a system if desired by enabling debug level tracing.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I238606e95bb91df93afd6ec7c00bd0577bc603f2
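The up/down counter evaluated on each interval can be sketched as below. The class name `ErrorCounter`, the saturation at the threshold, and the `atThreshold()` query are assumptions for illustration; the commit does not spell out those details.

```cpp
#include <cstddef>

// Hypothetical up/down error counter checked once per count interval.
class ErrorCounter
{
  public:
    explicit ErrorCounter(std::size_t threshold) : _threshold(threshold) {}

    // Called on every interval expiration, whether or not the tach
    // reading changed since the last check.
    void tick(bool outOfRange)
    {
        if (outOfRange)
        {
            // Assumed to saturate at the threshold.
            if (_count < _threshold)
            {
                ++_count;
            }
        }
        else if (_count > 0)
        {
            --_count;
        }
    }

    // True once enough out-of-range checks have accumulated.
    bool atThreshold() const
    {
        return _count >= _threshold;
    }

    std::size_t count() const
    {
        return _count;
    }

  private:
    std::size_t _threshold;
    std::size_t _count = 0;
};
```

Because `tick()` is driven by the interval rather than by a tach change, a rotor stuck at a constant out-of-range speed would still move the count toward the threshold.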
|
#
4283c5d5 |
| 01-Mar-2021 |
Matt Spinler <spinler@us.ibm.com> |
monitor: Allow missing D-Bus sensors on startup
Now that phosphor-fan-monitor is starting at the multi-user target, it may be starting before the fan sensor hwmon daemon is able to put the tach reading sensors on D-Bus. This was causing the TachSensor class objects to not get created so even if the hwmon tach sensor values did show up later on D-Bus fan monitor wouldn't notice them.
To fix this, still create the TachSensor objects if the corresponding hwmon D-Bus objects aren't there, and still set them to functional in the inventory so that any other monitoring code, such as phosphor-dbus-monitor, won't shut down the system before the hwmon tach sensors get a chance to show up on D-Bus, which was happening on witherspoon when a reboot was done with the power on.
When the monitor delay timer expires to kick off monitoring, a D-Bus read is forced, and if the hwmon sensors still aren't on D-Bus then the corresponding TachSensor objects will be set to nonfunctional to start down the error paths.
Also, when the power state changes to on, instead of blindly setting all TachSensor objects to functional, again check if their hwmon sensor values are on D-Bus before doing so.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: I3e62727296630bf68602b0472328f4613e1a78e3
|
#
7d135641 |
| 04-Feb-2021 |
Matt Spinler <spinler@us.ibm.com> |
monitor: Support for running with power off
Put in the remaining changes necessary so that fan monitor doesn't need to be killed when power turns off.
This includes things like:
* Support for starting before the Present property is on D-Bus.
* Support for starting before the config file name is available.
* Stopping any running timers when power is turned off.
* Checking the power off rules when power turns on.
Most, but not all, of the changes are common between the JSON and YAML modes, but this is only truly supported when compiled for JSON.
This also removes the init vs monitor modes of operation, if compiled for JSON.
Signed-off-by: Matt Spinler <spinler@us.ibm.com> Change-Id: Ic2c6848f24511c9dc763227e05bbebb4c8c80cd1
|
#
fcb0dbcb |
| 10-Feb-2021 |
Matthew Barth <msbarth@us.ibm.com> |
monitor: Trust group cancel/start monitoring
Any fan monitoring trust groups configured should cancel the monitoring of all fan sensors configured in the trust group for all monitoring methods (timer based or count based) when the group is determined not to be trusted.
Also, the trust manager should restart monitoring all the sensors in the group by processing each sensor against the configured monitoring method. This matches how each sensor is processed on each tach changed signal where only when a tach changed signal is received does the trust state of the sensor get checked prior to processing the state of the sensor.
Tested: Verify fan sensors in trust group's functional state is correct
Signed-off-by: Matthew Barth <msbarth@us.ibm.com> Change-Id: I7b2ffc538eb1f17017826235353dba85e4f92ebe
|