History log of /openbmc/phosphor-health-monitor/ (Results 51 – 75 of 94)
Revision Date Author Comments
2a940011 25-Jun-2023 Jian Zhang <zhangjian.3032@bytedance.com>

misc: Move to steady_timer

We found that the system clock can affect the deadline timer, causing
the timer to be cancelled if it does not meet expectations, so we
replaced it with steady_timer here, which many repositories have
already done, and this should have no other effect.

Change-Id: Id0d514100cdedc718c4e9dfc328bae8476aa1aeb
Signed-off-by: Jian Zhang <zhangjian.3032@bytedance.com>
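
The effect described in this commit can be illustrated outside Boost.Asio. The following is a minimal Python sketch, not the daemon's code: `Deadline` and `FakeClocks` are hypothetical names, and the clocks are simulated so that the wall clock can be stepped. It shows that a deadline computed against a wall clock shifts when the clock is adjusted, while one computed against a steady (monotonic) clock does not.

```python
import time


class Deadline:
    """Deadline measured against an injectable clock function."""

    def __init__(self, clock, delay_s):
        self._clock = clock
        self._expires_at = clock() + delay_s

    def remaining(self):
        """Seconds left until expiry (negative once expired)."""
        return self._expires_at - self._clock()


class FakeClocks:
    """Simulated clocks: the wall clock can be stepped, the steady one cannot."""

    def __init__(self):
        self.elapsed = 0.0      # real time that has passed
        self.wall_offset = 0.0  # adjustment applied by e.g. NTP or `date -s`

    def steady(self):
        return self.elapsed

    def wall(self):
        return 1_700_000_000.0 + self.elapsed + self.wall_offset


clocks = FakeClocks()
wall_deadline = Deadline(clocks.wall, delay_s=10.0)
steady_deadline = Deadline(clocks.steady, delay_s=10.0)

clocks.elapsed += 1.0         # one second really passes
clocks.wall_offset += 3600.0  # the wall clock is stepped forward by an hour

wall_remaining = wall_deadline.remaining()      # negative: appears long expired
steady_remaining = steady_deadline.remaining()  # nine seconds still to go

# With real clocks, time.monotonic plays the role of the steady clock.
real_deadline = Deadline(time.monotonic, delay_s=10.0)
```

In the simulated run only the wall-clock deadline is disturbed by the clock step, which is the class of problem the move to `steady_timer` avoids.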

658d70aa 10-May-2023 Patrick Williams <patrick@stwcx.xyz>

clang-format: copy latest and re-format

clang-format-16 has some backwards incompatible changes that require
additional settings for best compatibility and re-running the formatter.
Copy the latest .clang-format from the docs repository and reformat the
repository.

Change-Id: Icdb911b82f317b75b8f147d9e8e0599131c7e601
Signed-off-by: Patrick Williams <patrick@stwcx.xyz>

67d40593 12-Apr-2023 Patrick Williams <patrick@stwcx.xyz>

meson: remove deprecated get_pkgconfig_variable

Since meson 0.56, the `get_pkgconfig_variable` has been deprecated. In
meson 0.58 the `get_variable` was enhanced to no longer require the
`pkgconfig` keyword argument. Ensure meson 0.58 is required and update
the usage of all `get_pkgconfig_variable` and `get_variable` to be the
modern variant.

Change-Id: I9daae807312503e2ba42b6cae0417c54621db394
Signed-off-by: Patrick Williams <patrick@stwcx.xyz>

ea0c9bb8 16-Mar-2023 Sui Chen <suichen@google.com>

Add "Restart=always" to the systemd unit file

The health-monitor daemon is not one-shot and should restart when it
gets terminated, so adding "Restart=always" to the systemd unit file.

Tested:
Before this change:
root@bmc:~# pidof health-monitor
310
root@bmc:~# kill -9 310
root@bmc:~# pidof health-monitor
root@bmc:~# # (Does not restart)

After this change:
root@bmc:~# pidof health-monitor
12839
root@bmc:~# kill -9 12839
root@bmc:~# pidof health-monitor
12904

Signed-off-By: Sui Chen <suichen@google.com>
Change-Id: I7a2402bdeb2de369cbaac1dd7c698812948a7003

a19c6fb2 06-Mar-2023 Ed Tanous <edtanous@google.com>

Move to boost::asio::post

This allows entity_manager to compile with BOOST_ASIO_NO_DEPRECATED set.
It was functionally changed a few years ago, and is identical to the
other behavior.

Change-Id: Ib8125fcf9bdddf0bafd46118c4e764f7f83a0c2d
Signed-off-by: Ed Tanous <edtanous@google.com>

ec6601d1 09-Jan-2023 Sui Chen <suichen@google.com>

Fix CPU utilization calculation

Currently, CPU utilization is calculated by (activeTimeDiff /
(activeTimeDiff + idleTimeDiff)), where idleTimeDiff is defined as "idle
time + IO wait time". The idleTimeDiff term is incorrect -- it
should be "everything else". At present, "kernel utilization",
"userspace utilization" and "overall utilization" can all reach "100%"
simultaneously, which does not make sense.

This change calculates CPU usage as follows:
* Kernel-space usage is "kernel time delta" / "total time delta".
* Userspace usage is "userspace delta" / "total time delta".
* Overall usage is "(kernel time delta + userspace delta)" /
"total time delta".

Tested:
Compile the "Explicit sampling version of the SmallPT path tracer"
(https://www.kevinbeason.com/smallpt/explicit.cpp) as a workload,
and run two copies of it on the BMC to fully stress the CPU cores.
(Alternatively, any benchmark program can fulfill this purpose, but for
this one, I understand what it does and know it's compute-bound.)

One can see from `htop`, both CPU cores are almost 100% occupied. Most
(around 90%) CPU time is spent in user-space. Remainder of the CPU
usage is attributable to other tasks and background processing.

When one checks the
`/xyz/openbmc_project/sensors/utilization/CPU_Kernel` and
`/xyz/openbmc_project/sensors/utilization/CPU_User` objects, one can see
CPU_User reading ramp up and reach around 90%. CPU_Kernel stabilizes
at 10%. When `smallpt_explicit` is terminated, kernel and userspace CPU
usage re-converge to their normal values.

Change-Id: I7c0e10e08bd2b6c8b3bd1c1a618fffb2739feecc
Signed-off-By: Sui Chen <suichen@google.com>
Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
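
The three ratios listed in this commit can be sketched as follows. This is an illustration under assumptions rather than the daemon's implementation: the mapping of `/proc/stat` fields to "userspace" (user + nice) and "kernel" (system) is assumed, the names `CpuTimes`, `parse_cpu_line` and `utilization` are hypothetical, and the two sample lines are made-up values.

```python
from typing import NamedTuple


class CpuTimes(NamedTuple):
    user: int    # assumed mapping: user + nice jiffies
    kernel: int  # assumed mapping: system jiffies
    total: int   # all accounted jiffies


def parse_cpu_line(line: str) -> CpuTimes:
    """Parse the aggregate "cpu" line of /proc/stat."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:]]
    user, nice, system = values[0], values[1], values[2]
    # Guest time (fields 9 and 10) is already counted in user/nice,
    # so only the first eight fields are summed for the total.
    return CpuTimes(user=user + nice, kernel=system, total=sum(values[:8]))


def utilization(prev: CpuTimes, curr: CpuTimes) -> dict:
    """Kernel, user and overall usage in percent between two samples."""
    total_delta = curr.total - prev.total
    if total_delta <= 0:
        return {"kernel": 0.0, "user": 0.0, "overall": 0.0}
    kernel = 100.0 * (curr.kernel - prev.kernel) / total_delta
    user = 100.0 * (curr.user - prev.user) / total_delta
    return {"kernel": kernel, "user": user, "overall": kernel + user}


# Made-up samples: 900 user jiffies and 100 kernel jiffies elapse, no idle time.
before = parse_cpu_line("cpu 1000 0 500 8000 100 0 0 0 0 0")
after = parse_cpu_line("cpu 1900 0 600 8000 100 0 0 0 0 0")
usage = utilization(before, after)
```

Because every ratio shares the same "total time delta" denominator, the kernel and user shares can no longer each reach 100% at the same time.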

f7406df0 06-Dec-2022 Patrick Williams <patrick@stwcx.xyz>

prettier: reformat with org-wide settings

Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
Change-Id: I29a6b362713a04d53e1e0cc14c3316ebd1023ce6

af109947 22-Nov-2022 Nan Zhou <nanzhoumails@gmail.com>

sensors: change object_manager path

As per PDI, all sensor implementations shall put the object manager at
`/xyz/openbmc_project/sensors`.
[1] https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Sensor/Value.interface.yaml#L20

Tested: on hardware, ipmitool sdr elist is working.
```
CPU | 36h | ok | 0.1 |
CPU_Kernel | 37h | ok | 0.1 |
CPU_User | 38h | ok | 0.1 |
Memory_Available | 39h | ok | 0.1 |
Storage_RW | 3Ah | ok | 0.1 |
```

Signed-off-by: Nan Zhou <nanzhoumails@gmail.com>
Change-Id: Ia9b21eebf37cb22cfa4bc19f2369423ea4d5c035

9ca00458 26-Nov-2022 Patrick Williams <patrick@stwcx.xyz>

sdbusplus: use shorter type aliases

The sdbusplus headers provide shortened aliases for many types.
Switch to using them to provide better code clarity and shorter
lines. Possible replacements are for:
* bus_t
* exception_t
* manager_t
* match_t
* message_t
* object_t
* slot_t

Change-Id: Ie38db6cd54835ce4f3a4eae520e9ad834e0751c1
Signed-off-by: Patrick Williams <patrick@stwcx.xyz>

b6f22cf7 22-Nov-2022 Jayashree Dhanapal <jayashree-d@hcl.com>

Remove initial values for Health sensor

For the first 120 (N - Window size), monitoring is ignored and the
default values are stored for the health sensor.

During boot-up, CPU utilization may be high because all the
applications are initializing. Therefore, remove the initial values,
monitor the values of each health sensor until the window size is
reached, and then update the value in D-Bus.

Tested : Tested and verified in Facebook YosemiteV2 platform.

Signed-off-by: Jayashree Dhanapal <jayashree-d@hcl.com>
Change-Id: I60fc25219337607ec911c1f7f674303dcb8fb63c

51bcfcb1 01-Nov-2021 Sui Chen <suichen@google.com>

Add kernel and user CPU utilization

This adds two utilization sensors, "CPU_Kernel" and "CPU_User" that
are populated with the fraction of CPU processing spent in kernel space
and user space.

The intended use case is that BMCWeb can read those sensors from
phosphor-health-monitor and report them in a Redfish response.

The Redfish resources are:
* ManagerDiagnosticData.ProcessorStatistics.KernelPercent
* ManagerDiagnosticData.ProcessorStatistics.UserPercent

Tested: Installed on a QEMU-emulated BMC
bmc# busctl tree xyz.openbmc_project.HealthMon
`-/xyz
  `-/xyz/openbmc_project
    `-/xyz/openbmc_project/sensors
      `-/xyz/openbmc_project/sensors/utilization
        |-/xyz/openbmc_project/sensors/utilization/CPU
        |-/xyz/openbmc_project/sensors/utilization/CPU_Kernel
        `-/xyz/openbmc_project/sensors/utilization/CPU_User

signed-off-by: Sui Chen <suichen@google.com>
Change-Id: Ic6792e4c7cd8eba144eb9adec9366c1bc15b1a44

517524a0 19-Dec-2021 Sui Chen <suichen@google.com>

Change association tuple to "monitoring, monitored_by, bmc"

Per the discussion in 46081, we need a way to differentiate between
"Host CPU usage" and "BMC CPU usage" for the DBus objects. Associations
have been used previously for this purpose, so the association for the
health utility "sensors" is changed accordingly.

From existing systems, some examples of existing association edge names
may include:
- "chassis": from a DBus sensor to a Chassis/Board inventory
- "all_sensors": the reverse of "chassis"
- "inventory": from a voltage sensor to a board
- "updateable", "active", "functional": from /software to a
particular version
- "software_version": reverse of the above three
- "containedby": from a module to a module container
- "contains": reverse of above

Considering existing association names and the "Requirements and
Expectations for dbus interfaces" document, we name the association
edges "monitoring" and "monitored_by". This association tuple is to be
interpreted as "the utilization sensors are monitoring the current BMC;
the current BMC is monitored by the utilization sensors".

Signed-off-by: Sui Chen <suichen@google.com>
Change-Id: Ib0c634df83beacb7acac1dc7885ba47e58523a79

3928a024 31-Aug-2022 Andrew Geissler <geissonator@yahoo.com>

monitor size of /tmp

OpenBMC applications use /tmp for a variety of reasons. If /tmp becomes
full, certain applications will start to fail (in very undefined
ways).

/tmp uses a tmpfs filesystem that is backed by BMC memory. Just
monitoring BMC memory is not enough though because by default, only 50%
of memory is allowed to be used by /tmp.

This commit adds in a default to monitor /tmp and log to the journal if
it sees /tmp exceed 85% usage. This will at least offer a clue to users
when debugging issues.

Tested:
- Tested as a part of the following commit:
https://gerrit.openbmc.org/c/openbmc/openbmc/+/56065

Signed-off-by: Andrew Geissler <geissonator@yahoo.com>
Change-Id: I1355f0d29a0b7a0840075561069d4ec7d3b27672
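
The check described in this commit amounts to comparing a filesystem's used share against a threshold. A minimal Python sketch, assuming hypothetical helper names `usage_percent` and `exceeds_threshold`; the 85% default comes from the commit message, while everything else (including the use of the platform's temporary directory and `print` in place of a journal entry) is an illustrative assumption.

```python
import shutil
import tempfile


def usage_percent(path: str) -> float:
    """Return the used share of the filesystem holding `path`, in percent."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def exceeds_threshold(percent: float, threshold: float = 85.0) -> bool:
    """True when usage is strictly above the threshold."""
    return percent > threshold


percent = usage_percent(tempfile.gettempdir())
if exceeds_threshold(percent):
    # Stand-in for the journal entry mentioned in the commit message.
    print(f"temporary directory is {percent:.1f}% full")
```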

b7d7bd5a 22-Aug-2022 Potin Lai <potin.lai@quantatw.com>

healthMonitor: use MemAvailable for memory utilization calculation

We noticed that memory utilization can easily reach 95% or higher when
large files are being read, because most of the memory is consumed by
the filesystem as file cache.

The current memory utilization calculation takes "total - free" as the
used memory size, which counts buff/cache as part of used memory.

Instead, get MemAvailable and MemTotal by parsing /proc/meminfo, then
calculate the "unavailable" memory percentage as the memory
utilization.

Tested results:

root@bletchley:~# free
              total        used        free      shared  buff/cache   available
Mem:        2066620      247320       59108       34412     1760192     1713028
Swap:             0           0           0

root@bletchley:~# busctl introspect xyz.openbmc_project.HealthMon \
> /xyz/openbmc_project/sensors/utilization/Memory \
> xyz.openbmc_project.Sensor.Value
NAME       TYPE     SIGNATURE RESULT/VALUE                             FLAGS
.MaxValue  property d         100                                      emits-change writable
.MinValue  property d         0                                        emits-change writable
.Unit      property s         "xyz.openbmc_project.Sensor.Value.Uni... emits-change writable
.Value     property d         14.9148                                  emits-change writable

Signed-off-by: Potin Lai <potin.lai@quantatw.com>
Change-Id: Ib87edb313bcfbdae1306847babc4c7a5d96766f9
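
The difference between the two calculations can be sketched in Python. This is not the daemon's code: `parse_meminfo` and `memory_utilization` are hypothetical names, and the sample text reuses the total, free and available figures from the `free` output quoted in this commit, laid out in `/proc/meminfo` style.

```python
def parse_meminfo(text: str) -> dict:
    """Map each /proc/meminfo key to its integer value."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts:
            values[key.strip()] = int(parts[0])
    return values


def memory_utilization(meminfo: dict) -> float:
    """Share of memory that is not available, in percent."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return 100.0 * (total - available) / total


sample = """MemTotal:        2066620 kB
MemFree:           59108 kB
MemAvailable:    1713028 kB
"""
info = parse_meminfo(sample)
percent = memory_utilization(info)

# The earlier "total - free" approach, shown for comparison only.
old_style = 100.0 * (info["MemTotal"] - info["MemFree"]) / info["MemTotal"]
```

With these figures the "total - free" formula lands above 95% while the MemAvailable-based value stays far lower, because file cache no longer counts as used memory. On a live system the text would come from reading `/proc/meminfo`.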

c82e6164 02-Aug-2022 Potin Lai <potin.lai@quantatw.com>

healthMonitor: wait until enough sensor reading in queue

Usually the BMC is very busy during startup and it is risky to fill up
the entire queue with only one initial value.

In this patch, we wait until the queue is filled with enough sensor
values before doing any checks.

Signed-off-by: Potin Lai <potin.lai@quantatw.com>
Change-Id: I38de9351e88fb666051008170bdeae635fbcd103
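
The idea of holding back checks until the queue is full can be sketched as a fixed-size window. This is an assumption-laden illustration: `ReadingWindow` is a hypothetical name, and averaging the window is assumed here as the value that would be checked; the commit message itself only states that checks wait until enough readings are queued.

```python
from collections import deque


class ReadingWindow:
    """Fixed-size window that reports nothing until it is full."""

    def __init__(self, window_size: int):
        self._values = deque(maxlen=window_size)

    def add(self, value: float) -> None:
        self._values.append(value)

    def is_full(self) -> bool:
        return len(self._values) == self._values.maxlen

    def average(self):
        """Return the window average, or None until the window is full."""
        if not self.is_full():
            return None
        return sum(self._values) / len(self._values)


window = ReadingWindow(window_size=3)
results = []
for reading in [95.0, 90.0, 20.0, 10.0]:  # made-up readings, busy at startup
    window.add(reading)
    results.append(window.average())
```

The first two entries in `results` are `None`, so a single high startup reading never stands in for a whole window of samples.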

973c1b69 04-Aug-2022 Patrick Williams <patrick@stwcx.xyz>

MAINTAINERS: remove file

The MAINTAINERS file is deprecated in favor of OWNERS.

Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
Change-Id: I444f9ba834bf7f4540ee5ddf99913ab9bdd2f94a


37f8b513 02-Aug-2022 Potin Lai <potin.lai@quantatw.com>

bmc_health_config: remove default threshold target unit

Remove reboot.target from the default json file to keep the same
behavior as before the target unit call implementation.

Signed-off-by: Potin Lai <potin.lai@quantatw.com>
Change-Id: I1e821731a3f5e36d49bd47b7daa19e254b927d58

bbfe7186 22-Jul-2022 Patrick Williams <patrick@stwcx.xyz>

sdbusplus: use shorter type aliases

The sdbusplus headers provide shortened aliases for many types.
Switch to using them to provide better code clarity and shorter
lines. Possible replacements are for:
* bus_t
* exception_t
* manager_t
* match_t
* message_t
* object_t
* slot_t

Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
Change-Id: Ibd0930728f0a945f424e3a3b5d3e8ea9e54f1979

85f31f8b 22-Jul-2022 Patrick Williams <patrick@stwcx.xyz>

OWNERS: switch 'matches' to 'matchers'

The original OWNERS template had a mistake which used 'matches' instead
of the field supported by the Gerrit plugin 'matchers'. Update the
OWNERS file to have the correct field.

Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
Change-Id: I6806c276b14e4906840c3b90e7f6aff60a1d3fb7

156ecf31 11-Jul-2022 Potin Lai <potin.lai@quantatw.com>

healthMonitor: call configured unit when threshold exceeded

Add threshold target implementation.
Call the system target unit configured in the config file when the
threshold value is exceeded.

Signed-off-by: Potin Lai <potin.lai@quantatw.com>
Change-Id: I31f0cf4df0c913f10c47ecb9fa24b59b7b9de0e5

7fc0aa1e 20-Jun-2022 Willy Tu <wltu@google.com>

cleanup: Remove unnecessary info message for dbus matcher

Remove the needUpdate message once it has been set to true.
This is to reduce the amount of spam generated by this daemon.

Change-Id: Id21675e2cbca89ebb35eb29af053bfd9d917b14a
Signed-off-by: Willy Tu <wltu@google.com>

f8d79737 11-Mar-2021 Yong Li <yong.b.li@linux.intel.com>

Make health-monitor compatible with “sensor list” command

When health-monitor runs alongside the dbus-sensors service, the
"ipmitool sensor list" command cannot list all sensors and reports an
error.
The root cause is that health-monitor exports a utilization sensor
that is incompatible with sensor reading:
* Max and min values should be set, and max should be greater than min
* Critical/warning values should be set
* Object Manager should be registered on dbus "/"

Tested:
Ipmitool sensor list can list all these sensors without error.

Signed-off-by: Yong Li <yong.b.li@linux.intel.com>
Change-Id: Ie9311f153b647e2ddec09aaa3edf448926c21e97

a1ed140b 21-Mar-2022 Patrick Williams <patrick@stwcx.xyz>

meson: simplify dependencies

Leverage wrapfile `[provide]` directives to simplify the dependency
searching in the meson.build.

Signed-off-by: Patrick Williams <patrick@stwcx.xyz>
Change-Id: Ib36f2cf3b3547838aaeb89c0867187f248910074

a6cd704b 21-Dec-2021 Konstantin Aladyshev <aladyshev22@gmail.com>

Set unset threshold values to NaN

Currently, unset thresholds are set to 0, but the correct value for
unset thresholds is NaN.

Signed-off-by: Konstantin Aladyshev <aladyshev22@gmail.com>
Change-Id: I1fbf120d5ea4af7e470fa4d9b7d77c6a14a15b96
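
Why NaN is a safer marker for "unset" than 0 can be shown with a small Python sketch. The `exceeds` helper and the sample values are hypothetical and only illustrate the comparison semantics; they are not taken from the daemon.

```python
import math


def exceeds(value: float, threshold: float) -> bool:
    """True only when a threshold is set (not NaN) and the value is above it."""
    return not math.isnan(threshold) and value > threshold


UNSET = math.nan

checks = {
    # With 0 as the "unset" marker, any positive reading looks like a violation.
    "zero_as_unset": 5.0 > 0.0,
    # With NaN as the marker, an unset threshold can never be exceeded.
    "nan_as_unset": exceeds(5.0, UNSET),
    # A real threshold still behaves as expected.
    "real_threshold": exceeds(95.0, 90.0),
}
```

Since 0 is also a legitimate threshold value for a percentage sensor, a consumer cannot tell "threshold at 0" apart from "no threshold" unless a distinct marker such as NaN is used.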

9d29b378 21-Dec-2021 Konstantin Aladyshev <aladyshev22@gmail.com>

Set correct values for sensor min and max properties

Interface 'xyz.openbmc_project.Sensor.Value' has 'MinValue' and
'MaxValue' properties. In the case of utilization sensors, these values
should be set to 0 and 100 respectively.

Signed-off-by: Konstantin Aladyshev <aladyshev22@gmail.com>
Change-Id: I3e6c75375c200fd1716a044a63782f310d4f914b
