73d2ac96 | 17-Feb-2022 |
Andrew Geissler <geissonator@yahoo.com> |
crit-service: start bmc quiesce target on fail
This target will be monitored by the BMC state target and used to tell external clients when the BMC is in a bad state due to a critical service failing
Signed-off-by: Andrew Geissler <geissonator@yahoo.com> Change-Id: Ibf0460bef9b3ac2a96e8a294e6de122463530713
|
21d305e2 | 11-Feb-2022 |
Andrew Geissler <geissonator@yahoo.com> |
crit-service: create bmc dump on failure
When a critical service fails, request a BMC dump to assist in debug of why the service failed.
Tested: - Caused critical service to fail and verified bmc dump was created
Signed-off-by: Andrew Geissler <geissonator@yahoo.com> Change-Id: I2f56af0ee43b84e4142cc29f3c9ddbec053b8334
|
e6841034 | 11-Feb-2022 |
Andrew Geissler <geissonator@yahoo.com> |
crit-service: add failed unit to error log
Whether it's a systemd target or a service, they are both units. Log that unit name to both error logs so it's easy to tell what failed.
Signed-off-by: Andrew Geissler <geissonator@yahoo.com> Change-Id: I32fc45ece1e9ca9b986ead3c9675cf71857b7fcd
|
f3870c62 | 10-Feb-2022 |
Andrew Geissler <geissonator@yahoo.com> |
crit-service: create error on failed service
This enhances the existing code to support logging an error when a monitored service fails. The same systemd event is triggered for a target failure and a service failure so no new logic is needed in that area.
Tested: - Repeatedly killed the host-state service until its unit went into the failed state. Verified this was detected and the expected log was created.
Signed-off-by: Andrew Geissler <geissonator@yahoo.com> Change-Id: I459dd8c35ceddec986336fee635fdf691257f758
|
9e3afdf0 | 10-Feb-2022 |
Andrew Geissler <geissonator@yahoo.com> |
crit-service: initial service and parsing
This is an initial commit in a series of commits that will introduce a service monitoring feature within the current target monitoring function.
This new feature will allow a user to pass in a JSON file with the systemd service names they want this function to monitor. If a monitored service goes into an error state (it has exhausted all retries and has been stopped), the monitor service will create an error and collect appropriate debug data.
This commit focuses on defining the new json service file and adapting the existing target monitor to take this file as input. Future commits in this series will build on this.
Tested: - Verified new service json could be input to application and it was parsed correctly
Signed-off-by: Andrew Geissler <geissonator@yahoo.com> Change-Id: Ifcc512b8fc868e2a1004a184fa50e4d4c826e8ee
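The input described in this commit can be sketched as follows. This is a minimal illustration only: the "services" key, the example unit names, and the function name are assumptions made here, not the schema the commit actually defines.

```python
import json


def parse_service_file(text):
    """Return the unit names listed in a service-monitor JSON document.

    The top-level "services" key is a hypothetical schema chosen for
    illustration; the real file format may differ.
    """
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    services = data.get("services", [])
    if not isinstance(services, list) or not all(
        isinstance(name, str) for name in services
    ):
        raise ValueError("'services' must be a list of unit-name strings")
    return services


# Hypothetical unit names, used only to exercise the parser.
example = '{"services": ["example-a.service", "example-b.service"]}'
monitored = parse_service_file(example)
```

Rejecting malformed input up front keeps the monitor from silently watching nothing when the file is mistyped.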
|
6810ad50 | 23-Feb-2022 |
Thang Tran <thuutran@amperecomputing.com> |
check requested state before update properties
When the power policy is "Restore" or "Always-Off", software only updates properties if the current requested state is "On". This avoids redundant work after an AC power cycle.
Tested:
1. Turn off the HOST and set the power policy to Always-Off:
   ipmitool chassis power off
   ipmitool chassis policy always-off
2. AC power cycle; wait until the BMC has rebooted.
3. HOST is OFF; check the journalctl log. The BMC did not request an update of the requested state.
4. Turn on the HOST, AC power cycle, wait until the BMC has rebooted.
5. HOST is OFF; check the journalctl log. The BMC requests an update of the requested state to "Off".
6. Set the power policy to Previous:
   ipmitool chassis policy previous
7. AC power cycle; wait until the BMC has rebooted.
8. HOST is OFF; check the journalctl log. The BMC did not request an update of the requested state.
9. Turn on the HOST, AC power cycle, wait until the BMC has rebooted.
10. HOST is ON; check the journalctl log. The BMC requests an update of the requested state to "On".
Signed-off-by: Thang Tran <thuutran@amperecomputing.com> Change-Id: I14f89d7cdd3911dc74f693aadb60ae04e6819bc2
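The guard described in this commit can be sketched roughly as below. The policy strings, the function name, and the unconditional handling of "AlwaysOn" are assumptions for illustration; only the "Restore" and "AlwaysOff" behavior is taken from the test steps in the commit message.

```python
def requested_state_update(policy, requested_state):
    """Return the value to write to the requested state, or None to skip.

    Hypothetical helper: models the rule that "Restore" and "AlwaysOff"
    only trigger an update when the current requested state is "On".
    """
    if policy not in ("AlwaysOn", "AlwaysOff", "Restore"):
        raise ValueError(f"unknown power policy: {policy}")
    if policy == "AlwaysOn":
        # Assumption: always-on is not covered by the guard.
        return "On"
    if requested_state != "On":
        # Nothing to change, so avoid a redundant property update.
        return None
    return "Off" if policy == "AlwaysOff" else "On"
```

Returning None rather than the unchanged value makes the "no update requested" case explicit, which matches what the test steps look for in the journal.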
|
d93da775 | 21-Feb-2022 |
Andrew Geissler <geissonator@yahoo.com> |
add namespace on setProperty
The function was moved to a utility interface, so the namespace needs to be added.
Signed-off-by: Andrew Geissler <geissonator@yahoo.com> Change-Id: Ia6fccbec909ad83bd5f1d6a335b1dc55948d2bfe
|
378fe11d | 03-Feb-2022 |
Andrew Geissler <geissonator@yahoo.com> |
ups: do not power on if power status is bad
This will be a configurable feature that people can bring in via an optional package within the phosphor-state-manager recipe.
Tested:
- Set CurrentPowerStatus to Good and verified app returned 0 with success log
- Set CurrentPowerStatus to UninterruptiblePowerSupply and verified error was logged and non-zero rc was returned
- Built full flash image and verified expected behavior in simulation
Signed-off-by: Andrew Geissler <geissonator@yahoo.com> Change-Id: I0fd5dc43b4476fd99f07d79169a71d102bb065e8
|
2cf2a268 | 02-Feb-2022 |
Andrew Geissler <geissonator@yahoo.com> |
ups: watch for property changes
Monitor for any changes to the properties the chassis manager is interested in with the UPower interface.
Tested: - Changed the properties via busctl after the chassis manager was running and verified the chassis power status was updated correctly
Signed-off-by: Andrew Geissler <geissonator@yahoo.com> Change-Id: I13f23d33581c2339506f9c4739ce36bf07aef46e
|
8b1f8620 | 28-Jan-2022 |
Andrew Geissler <geissonator@yahoo.com> |
ups: check state on startup and update power status
Look for a UPS (uninterruptible power supply) D-Bus object on chassis startup. If found, read its State and BatteryLevel properties to check the condition of the power to the chassis. Update the CurrentPowerStatus property hosted by the chassis state manager based on this information.
Tested: - Utilized ibm-ups feature to start in debug mode so all properties could be manually set. Validated good path and bad paths for all properties and ensured CurrentPowerStatus was set as expected for each test.
Signed-off-by: Andrew Geissler <geissonator@yahoo.com> Change-Id: I064a9e7fecbed8f859828359a761877e0fa95251
|
b2b3d9c2 | 26-Jan-2022 |
Andrew Geissler <geissonator@yahoo.com> |
pinhole: no power restore policy on pinhole reset
If the user initiated a BMC reboot via the pinhole reset then do not run the power restore policy. The pinhole reset is mostly used in debug situations where a system is having issues. Disabling the power restore logic helps keep debug simpler.
Tested:
- The overall pinhole reset logic has not been tested on hardware.
- A variety of tests were done within simulation to validate the different paths in this series of patches
- Full end to end testing will occur once all function is in place due to the complexities of physically toggling the pinhole reset
Signed-off-by: Andrew Geissler <geissonator@yahoo.com> Change-Id: Ia656b4872620b6a1fc6ba8f82c01f041d43378a2
|
49e6713a | 26-Jan-2022 |
Andrew Geissler <geissonator@yahoo.com> |
pinhole: move power policy service to utils
Move the getProperty() function to utils and use it for all utility functions
Signed-off-by: Andrew Geissler <geissonator@yahoo.com> Change-Id: I3128d6006dc5f72a579daaf168b9976ee5bcb2e8
|
7ed36236 | 26-Jan-2022 |
Andrew Geissler <geissonator@yahoo.com> |
pinhole: do not log power loss error on pinhole
If the user did a pinhole reset to reboot the BMC and that happens to cause a chassis power off, do not log an error as this was user initiated. A separate error is logged to inform the user a pinhole reset was done to the system.
Signed-off-by: Andrew Geissler <geissonator@yahoo.com> Change-Id: Ib9059eeaa4325e7742ce024eed2f33923c6dea2a
|
a2a7e122 | 26-Jan-2022 |
Andrew Geissler <geissonator@yahoo.com> |
pinhole: generate log when pinhole reset occurs
A pinhole reset is an important event for system admins and service personnel to be aware of. Create a log to record this event.
Signed-off-by: Andrew Geissler <geissonator@yahoo.com> Change-Id: Ied4036e71655c61761e1a0cc46c881116a45685e
|
9d4d0c91 | 26-Jan-2022 |
Andrew Geissler <geissonator@yahoo.com> |
pinhole: utility interface to create errors
This will be used in a later commit to create an informational error when the pinhole reset is detected.
Signed-off-by: Andrew Geissler <geissonator@yahoo.com> Change-Id: I3ba0e9bbce306db29dcc70954ffafe90287e1a14
|
98e64e6d | 25-Jan-2022 |
Andrew Geissler <geissonator@yahoo.com> |
pinhole: check for bmc reset reason
If firmware cannot determine the reason for the BMC reset from the sysfs bootstatus value, check the pinhole reset GPIO to see whether that is the reason for the reboot.
See the following for more details: https://github.com/openbmc/docs/blob/master/designs/power-recovery.md
Signed-off-by: Andrew Geissler <geissonator@yahoo.com> Change-Id: If8e6e8cdc54dcb5f596d530c03c4676117fc8a47
|
f8ae6a02 | 21-Jan-2022 |
Andrew Geissler <geissonator@yahoo.com> |
pinhole: move gpio function to utils
The function to read a GPIO is needed elsewhere in later reviews so move to the utility file.
Signed-off-by: Andrew Geissler <geissonator@yahoo.com> Change-Id: I994d4a912c0abe9cae6cb02d22bf5be09581d332
|
1fc48456 | 08-Feb-2022 |
Thang Tran <thuutran@amperecomputing.com> |
set requested state to off when power policy is always-off
Issue:
Step 1: Set the power policy to "always-on", then turn the chassis power off and on. HOST is turned on after the BMC has rebooted.
Step 2: Set the power policy to "always-off", then turn the chassis power off and on. HOST is turned off after the BMC has rebooted.
Step 3: Set the power policy to "previous", then turn the chassis power off and on. Wait until the BMC has rebooted.
Expected: At step 3, HOST should be off, because the BMC turned it off in step 2.
Actual: At step 3, HOST is turned on.
Root cause: In step 1, the BMC updated the requested state to "On". In step 2, the BMC did not update the requested state to "Off", so it remained "On". Therefore, in step 3, when the BMC checks the requested state (still "On"), it turns on the HOST.
Solution: When the power policy is "always-off", set the requested state to "Off".
Tested: Repeated the 3 steps above; at step 3, the HOST state is "Off".
Signed-off-by: Thang Tran <thuutran@amperecomputing.com> Change-Id: Ib50de99ae935f2ccb777226d78848e33e5db9906
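The three-step scenario above can be replayed with a small model. This is a sketch under stated assumptions: the function names, the policy strings, and the `fixed` flag are hypothetical and only mirror the behavior described in the commit message, not the project's actual code.

```python
def ac_cycle(policy, requested, fixed=True):
    """Model one AC power cycle; return (host_state, new_requested_state).

    `fixed` toggles the behavior added by this commit: recording "Off"
    as the requested state when the policy is "always-off".
    """
    if policy == "always-on":
        return "On", "On"
    if policy == "always-off":
        return "Off", ("Off" if fixed else requested)
    if policy == "previous":
        # Restore whatever was last requested.
        return requested, requested
    raise ValueError(f"unknown power policy: {policy}")


def run_steps(fixed):
    """Replay steps 1 to 3 and return the final host state."""
    requested = "Off"
    host = "Off"
    for policy in ("always-on", "always-off", "previous"):
        host, requested = ac_cycle(policy, requested, fixed)
    return host


before_fix = run_steps(fixed=False)  # host ends up "On" (the reported bug)
after_fix = run_steps(fixed=True)    # host ends up "Off" (expected)
```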
|
ba2241c6 | 26-Oct-2021 |
Thang Tran <thuutran@amperecomputing.com> |
Support checking host status via GPIO pins
Currently, OpenBMC supports checking host status via IPMI and PLDM. However, not all platforms support IPMI and PLDM. Some platforms, such as Ampere Mt.Jade, check an input GPIO to detect the host firmware boot status. This commit adds a D-Bus method to check host status via GPIO pins; it checks the value of the "host0-ready/-n" pin to detect whether the HOST is ready.
Tested:
1. Update the GPIO pin used to detect host0 ready to "host0-ready" in the device tree.
2. Enable the host-gpios option, add the executable and service files to the OpenBMC software, then flash the board.
3. In the BMC console, check host-gpio-condition with the command
   busctl tree xyz.openbmc_project.State.HostCondition.Gpio
   Result:
   `-/xyz
     `-/xyz/openbmc_project
       `-/xyz/openbmc_project/Gpios
         `-/xyz/openbmc_project/Gpios/host0
4. Turn HOST0 on/off, then check the host status with the command
   busctl get-property \
       xyz.openbmc_project.State.HostCondition.Gpio \
       /xyz/openbmc_project/Gpios/host0 \
       xyz.openbmc_project.Condition.HostFirmware \
       CurrentFirmwareCondition
   Result:
   "xyz.openbmc_project.Condition.HostFirmware.FirmwareCondition.<Running/Off>"
Signed-off-by: Thang Tran <thuutran@amperecomputing.com> Change-Id: I20e38e76c5b70119f0c86e5b497d47453d7c5a6c
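The mapping from the GPIO value to the firmware condition might look like the sketch below. The function name is hypothetical, and treating the "-n" suffix as an active-low line is an assumption drawn from the "host0-ready/-n" naming, not something the commit message states outright.

```python
def firmware_condition(line_name, line_value):
    """Map a host-ready GPIO reading to "Running" or "Off".

    Assumption: a line name ending in "-n" is active-low, so a value of
    0 means the host is ready; otherwise a value of 1 means ready.
    """
    if line_value not in (0, 1):
        raise ValueError("GPIO value must be 0 or 1")
    active_low = line_name.endswith("-n")
    ready = (line_value == 0) if active_low else (line_value == 1)
    return "Running" if ready else "Off"
```

For example, `firmware_condition("host0-ready", 1)` and `firmware_condition("host0-ready-n", 0)` both report "Running" under these assumptions.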
|
68a8c31d | 03-Dec-2021 |
Andrew Geissler <geissonator@yahoo.com> |
host-restart: set restart reason on scheduling use
When the scheduling feature is the reason for a power on of the system, set the RestartCause property appropriately.
This property is used by other software on some systems to guide partition behaviors in the host code.
Tested: - Used the scheduling feature to start a boot of the system and verified the RestartCause was set to ScheduledPowerOn
Signed-off-by: Andrew Geissler <geissonator@yahoo.com> Change-Id: I9f7e88ccf214d01a18c6641b0a016ff5576ccf5f
|
744e5617 | 03-Dec-2021 |
Andrew Geissler <geissonator@yahoo.com> |
host-restart: set restart reason on power policy use
When the power policy is the reason for a power on of the system, set the RestartCause property appropriately.
This property is used by other software on some systems to guide partition behaviors in the host code.
Tested: - Set boot policy to Power.RestorePolicy.Policy.AlwaysOn and Power.RestorePolicy.Policy.Restore, verified RestartCause updated to expected value.
Signed-off-by: Andrew Geissler <geissonator@yahoo.com> Change-Id: Ie3484d02e357e0b8f0c81d6abedda5b6f7d66cb5
|
828f874a | 03-Dec-2021 |
Andrew Geissler <geissonator@yahoo.com> |
host-restart: ensure reason cleared on host stop
The host RestartCause property, similar to other host related properties, should be reset whenever the host is stopped and the system is brought down.
Tested: - Verified on a power off of a system that the RestartCause was reset to xyz.openbmc_project.State.Host.RestartCause.Unknown
Signed-off-by: Andrew Geissler <geissonator@yahoo.com> Change-Id: Ia860409eca4051b4f6bd4dbb997cbb528a7b2b2b
|
1e89e622 | 24-Oct-2021 |
Manojkiran Eda <manojkiran.eda@gmail.com> |
Add OWNERS file
Signed-off-by: Manojkiran Eda <manojkiran.eda@gmail.com> Change-Id: I93afbd391efd891dc41dd68916ea8bd543acc928 |
8583b3b9 | 06-Oct-2021 |
Patrick Williams <patrick@stwcx.xyz> |
catch exceptions as const
Signed-off-by: Patrick Williams <patrick@stwcx.xyz> Change-Id: I2afb405177268451e44f0aff8417777a00d292d9 |
2c36e5ad | 12-Jul-2021 |
Ben Tyner <ben.tyner@ibm.com> |
Chassis: Check for standby voltage regulator fault
When an unexpected power down is detected, check the standby voltage regulator fault GPIO for a latched fault event. If a regulator fault was detected, log the event.
Signed-off-by: Ben Tyner <ben.tyner@ibm.com> Change-Id: I98729118332c7a7785f9048f6ac7cfe1ce882bb6
|