32453e9b | 15-Dec-2021 |
Brandon Wyman <bjwyman@gmail.com> |
psu-ng: Continue reading after readFail
Update the PMBus::read() function to allow for not creating journal trace and elog, but default to continuing to trace and elog, the previous behavior.
If we reach the limit of read failures that results in a communication error log, continue to read, but stop logging failures.
If communication restores, we may be able to detect what caused the read failure, or otherwise detect or clear new faults.
Change-Id: If59b86211ab54c31248ede78f8f117b607298923 Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
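The behaviour described above could be sketched roughly as follows. This is an illustration only, not the project's code: `ReadMonitor`, `LOG_LIMIT`, the `Reader` callback and its `errTrace` flag are hypothetical names, and resetting the counter on a successful read is an assumption.

```cpp
#include <cstddef>
#include <functional>
#include <optional>
#include <utility>

// Hypothetical limit at which a communication error would be logged.
constexpr std::size_t LOG_LIMIT = 3;

// Illustrative monitor: keeps reading after the limit, but stops asking
// the reader to trace/log failures once the limit has been reached.
class ReadMonitor
{
  public:
    // The reader returns a value, or std::nullopt on a failed read.
    // errTrace tells it whether it may create a trace/error log.
    using Reader = std::function<std::optional<int>(bool errTrace)>;

    explicit ReadMonitor(Reader r) : reader(std::move(r)) {}

    std::optional<int> poll()
    {
        // Only request tracing while still below the failure limit.
        const bool errTrace = readFail < LOG_LIMIT;
        std::optional<int> value = reader(errTrace);
        if (value.has_value())
        {
            readFail = 0; // assumption: communication restored
        }
        else if (readFail < LOG_LIMIT)
        {
            ++readFail;
        }
        return value;
    }

    bool commFault() const
    {
        return readFail >= LOG_LIMIT;
    }

  private:
    Reader reader;
    std::size_t readFail = 0;
};
```

With a reader that always fails, five polls would produce five read attempts but only three logged failures under these assumptions.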
|
b70eae9a | 20-Jan-2022 |
Adriana Kobylak <anoo@us.ibm.com> |
psu-ng: Check if mismatched PSU is supported
The function that checks that all PSUs have the same model currently calls out the first mismatched PSU. If the mismatched PSU is listed in the supported configurations and the base model is not, the error log may cause confusion, because the callout data would say to replace the PSU that is supported instead of the one that is not supported on that system.
Therefore check the supported configurations to determine which PSU to callout in case of a mismatch.
Tested: On p10bmc with ps0 model 2B1D (not supported) and ps1 model 2B1E (supported):
- Before change:
    "ACTUAL_MODEL": "2B1E",
    "CALLOUT_INVENTORY_PATH": "/xyz/openbmc_project/inventory/system/chassis/motherboard/powersupply1",
    "EXPECTED_MODEL": "2B1D",
- After change:
    "ACTUAL_MODEL": "2B1D",
    "CALLOUT_INVENTORY_PATH": "/xyz/openbmc_project/inventory/system/chassis/motherboard/powersupply0",
    "EXPECTED_MODEL": "2B1E",
Change-Id: I0b2d487e12f55e08a93e77b6c569726dde9d4e68 Signed-off-by: Adriana Kobylak <anoo@us.ibm.com>
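As a rough illustration of the callout choice described in this commit, the sketch below picks which PSU to call out when two models differ. The `Psu` and `Callout` types and the `findModelMismatch` function are hypothetical, and treating the first PSU as the base model is an assumption.

```cpp
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified view of a power supply.
struct Psu
{
    std::string inventoryPath;
    std::string model;
};

// Hypothetical callout data, mirroring the fields shown in the test output.
struct Callout
{
    std::string inventoryPath;
    std::string actualModel;
    std::string expectedModel;
};

// Returns the PSU to call out, or std::nullopt if all models match.
std::optional<Callout>
    findModelMismatch(const std::vector<Psu>& psus,
                      const std::set<std::string>& supportedModels)
{
    if (psus.empty())
    {
        return std::nullopt;
    }
    const Psu& base = psus.front(); // assumption: first PSU is the base
    for (const Psu& psu : psus)
    {
        if (psu.model == base.model)
        {
            continue;
        }
        const bool baseSupported = supportedModels.count(base.model) > 0;
        const bool psuSupported = supportedModels.count(psu.model) > 0;
        if (psuSupported && !baseSupported)
        {
            // The base is the unsupported one, so call it out instead.
            return Callout{base.inventoryPath, base.model, psu.model};
        }
        return Callout{psu.inventoryPath, psu.model, base.model};
    }
    return std::nullopt;
}
```

With ps0 as 2B1D (not supported) and ps1 as 2B1E (supported), this sketch calls out ps0 with an expected model of 2B1E, which matches the "After change" output in the test notes.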
|
c2906f47 | 21-Dec-2021 |
Brandon Wyman <bjwyman@gmail.com> |
psu-ng: De-glitch all faults
Use DEGLITCH_LIMIT to determine all the faults. If a fault bit is on, do not consider that a fault until it is seen at least DEGLITCH_LIMIT times. With DEGLITCH_LIMIT set to 3, the monitor would need to see a fault bit on 3 times in a row before indicating that the power supply has that fault.
This was done earlier for the PGOOD fault detection.
Change-Id: I918c2fcdd1d90ae253ab268bd04aa7a0da0208b8 Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
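A minimal sketch of the de-glitch idea, assuming a simple per-fault counter that resets whenever the bit is seen off. `DeglitchedFault` is a hypothetical type, and the limit of 3 is the example value from the commit message, not necessarily the configured one.

```cpp
#include <cstddef>

// Example value taken from the commit message.
constexpr std::size_t DEGLITCH_LIMIT = 3;

// Illustrative tracker for one fault bit across successive polls.
class DeglitchedFault
{
  public:
    // Feed in the raw fault bit from the latest status read.
    void update(bool bitOn)
    {
        if (bitOn)
        {
            if (count < DEGLITCH_LIMIT)
            {
                ++count; // saturate at the limit
            }
        }
        else
        {
            count = 0; // any clean read restarts the count
        }
    }

    // True only after DEGLITCH_LIMIT consecutive reads with the bit on.
    bool isFaulted() const
    {
        return count >= DEGLITCH_LIMIT;
    }

  private:
    std::size_t count = 0;
};
```

Using a count rather than a boolean is what lets the same member variable carry both "seen at least once" and "seen often enough to report".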
|
d8fc0685 | 11-Jan-2022 |
Jim Wright <jlwright@us.ibm.com> |
pseq: Setup UCD90320 device GPIOs
Set up GPIOs for the UCD90320 device. The GPIOs will be used for power good fault isolation. Use libgpiod.
Signed-off-by: Jim Wright <jlwright@us.ibm.com> Change-Id: I9c104e3fddcb1847a95f6fb67bf6c139745f2d0c
|
82affd94 | 24-Nov-2021 |
Brandon Wyman <bjwyman@gmail.com> |
psu-ng: Clear faults when voltage back in range
If the last read voltage (via READ_VIN) was below the minimum and now it is back in a valid range (100 or 200 volt range valid), clear all the faults to allow for re-detection of faults and logging of new errors.
Trace if INPUT_FAULT_WARN or VIN_UV clear. We should not expect to see that without sending a CLEAR_FAULTS command (or a power cycle).
Tested:
  Rainier 2S2U real hardware.
    ePDU outlet off/on allows re-detection of injected CML fault.
      - input fault, vin_uv fault, pgood/off fault.
      - repeat shows faults cleared, and new faults logged.
  Simulator pgood fault, then low voltage followed by good voltage.
    Verify simulator can re-detect faults after voltage back in range.
  Simulator fake input fault/warn on, then off and other fault on.
    - verified tracing input going off without clear faults sent.
  Simulator fake input fault/warn on, then no faults.
    - verified tracing input going off without clear faults sent.
Change-Id: Ic8022cf137978ff660680e9680f778853cbecf0d Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
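The transition check described here might look something like the sketch below. `VinMonitor` and the `VIN_VOLTAGE_MIN` value of 20 are assumptions for illustration, and the sketch treats any reading at or above the minimum as valid rather than distinguishing the 100 and 200 volt ranges.

```cpp
#include <optional>

// Hypothetical minimum input voltage, in volts.
constexpr int VIN_VOLTAGE_MIN = 20;

// Illustrative tracker for the last READ_VIN value.
class VinMonitor
{
  public:
    // Returns true when the previous reading was below the minimum and
    // the new one is back in range, i.e. when faults should be cleared.
    bool update(int vin)
    {
        const bool wasLow =
            lastVin.has_value() && (*lastVin < VIN_VOLTAGE_MIN);
        const bool nowValid = vin >= VIN_VOLTAGE_MIN;
        lastVin = vin;
        return wasLow && nowValid;
    }

  private:
    // Empty until the first reading, so startup does not trigger a clear.
    std::optional<int> lastVin;
};
```

A caller would invoke its fault-clearing routine whenever `update()` returns true, which allows existing faults to be re-detected and logged again.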
|
ccea2d2b | 10-Dec-2021 |
Jim Wright <jlwright@us.ibm.com> |
pseq: Add setPowerSupplyError D-Bus method
Add a setPowerSupplyError method to the org.openbmc.control.Power D-Bus interface. When a power supply error is detected which is severe enough to cause a power good failure, that error should be used in preference to the power good error. Add a D-Bus method to allow this to be communicated between the applications. The parameter passed should be the power supply error to log, for example "xyz.openbmc_project.Power.PowerSupply.Error.PSKillFault".
Signed-off-by: Jim Wright <jlwright@us.ibm.com> Change-Id: I8500ace4638236dda2d7ff4361b61efa30a50ac1
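The preference rule could be sketched as below, leaving out all D-Bus plumbing. `PowerControlSketch`, `pgoodFailureError` and the generic error string are hypothetical placeholders; only the `setPowerSupplyError` name and the example PSKillFault error come from the commit message.

```cpp
#include <string>

// Illustrative stand-in for the power sequencer's state; not the real class.
class PowerControlSketch
{
  public:
    // Called (in the real system, over D-Bus) when a severe power supply
    // error has been detected.
    void setPowerSupplyError(const std::string& error)
    {
        powerSupplyError = error;
    }

    // Error to log on a power good failure: the power supply error is
    // preferred when one has been reported.
    std::string pgoodFailureError() const
    {
        // Placeholder name for the generic power good error.
        static const std::string generic = "example.PowerGoodFailure";
        return powerSupplyError.empty() ? generic : powerSupplyError;
    }

  private:
    std::string powerSupplyError;
};
```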
|
f087f475 | 21-Dec-2021 |
Brandon Wyman <bjwyman@gmail.com> |
psu-ng: Refactor VIN_UV fault detection
Split off code checking for VIN_UV fault in STATUS_WORD to its own function.
Change-Id: Ifd07b6958885ed19c7611e10e343d1a5f10ec684 Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
|
d5d9a225 | 21-Dec-2021 |
Brandon Wyman <bjwyman@gmail.com> |
psu-ng: Refactor fan fault detection
Split off code checking for fan fault/warning in STATUS_WORD to its own function.
Change-Id: I49b6dc2d62b6ca39a564262a4745aa5ed25c14eb Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
|
08378784 | 21-Dec-2021 |
Brandon Wyman <bjwyman@gmail.com> |
psu-ng: Refactor VOUT_UV_FAULT detection
Split off code checking for VOUT_UV fault in STATUS_WORD to its own function.
Change-Id: I3b5e898a7d4f1ad21317c66b7fcb97c211581dcd Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
|
a00e7300 | 21-Dec-2021 |
Brandon Wyman <bjwyman@gmail.com> |
psu-ng: Refactor IOUT_OC_FAULT detection
Split off code checking for IOUT_OC fault in STATUS_WORD to its own function.
Change-Id: I4925149b0b1f9bce2400c83eb2808aeba0f7d1cc Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
|
c2c87131 | 21-Dec-2021 |
Brandon Wyman <bjwyman@gmail.com> |
psu-ng: Refactor VOUT_OV_FAULT detection
Split off code checking for VOUT_OV fault in STATUS_WORD to its own function.
Change-Id: Id5fef1a3830ff4a60ca235bbf83b3f99caea5986 Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
|
e3b0bb01 | 21-Dec-2021 |
Brandon Wyman <bjwyman@gmail.com> |
psu-ng: Refactor input fault detection
Split off input fault checking of STATUS_WORD into its own function.
Change-Id: I72a299ec1e905c9d59460b37f594997eee124e27 Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
|
c220343c | 21-Dec-2021 |
Brandon Wyman <bjwyman@gmail.com> |
psu-ng: Refactor CML fault detection
Split off CML fault checking of STATUS_WORD into its own function.
Change-Id: I845fe0dbcc86f085057bd5128303dde04cd9057f Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
|
52cb3f28 | 21-Dec-2021 |
Brandon Wyman <bjwyman@gmail.com> |
psu-ng: Refactor temperature fault detection
Split off temperature fault checking of STATUS_WORD into its own function.
Change-Id: Ia7dc08af12647cc7ca76356c9e9d2b75e2e95f56 Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
|
993b554f | 21-Dec-2021 |
Brandon Wyman <bjwyman@gmail.com> |
psu-ng: Refactor PGOOD and UNIT_IS_OFF fault
Split the code for detecting PGOOD and/or UNIT_IS_OFF fault into its own private member function.
The analyze() function is getting a bit long and hard to read.
Change-Id: I48771a5d4e8991ce37b54bd6ad4c3e938924418e Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
|
6c2ac394 | 21-Dec-2021 |
Brandon Wyman <bjwyman@gmail.com> |
psu-ng: Refactor MFR fault detection
Move handling of MFR fault to its own analyzeMFRFault() function.
This will handle checking if the MFR fault bit is on in STATUS_WORD, and determining what power-supply-specific meaning any STATUS_MFR_SPECIFIC bits have by calling determineMFRFault().
The analyze() function is getting a bit long and hard to read.
Change-Id: I401ebcf11943099385044081518a27511075fa94 Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
|
e3f7ad23 | 21-Dec-2021 |
Brandon Wyman <bjwyman@gmail.com> |
psu-ng: Refactor clearing fault member variables
Consolidated the copy/paste of clearing the fault member variables into a helper function that both analyze() and clearFaults() can use.
Change-Id: Ib56718b0d4cc36edd000b9ba1f52fb42047e2a8c Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
|
925c0263 | 21-Dec-2021 |
Brandon Wyman <bjwyman@gmail.com> |
psu-ng: Change pgoodFault type to size_t
The Misc Guidelines section of the coding conventions indicates that we should "Always use size_t or ssize_t for things that are sizes, counts, etc. ...."
https://github.com/openbmc/docs/blob/master/cpp-style-and-conventions.md
Change-Id: I23eba141c00e138477e008a40962f0c1af94bb51 Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
|
391a0690 | 08-Dec-2021 |
Brandon Wyman <bjwyman@gmail.com> |
psu-ng: gtest cleanup expectation warnings
A number of the tests are missing various EXPECT_CALL statements, which results in very verbose testlog.txt output. This becomes especially problematic when a test fails, as there are pages of output to look through to narrow down what failed where and why.
Adding in the EXPECT_CALL statements that should be there, such as findHwmonDir, the reading of "in1_input" as part of the fault clearing, etc.
Change-Id: I9f2f88622ad7b682461069df980a50b0b13c44a6 Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
|
d8a8617a | 08-Dec-2021 |
Jim Wright <jlwright@us.ibm.com> |
pseq: Find and parse configuration file
Use the compatible systems information from entity manager to find the correct system specific configuration file. Then parse the rail and pin data from the configuration file.
Signed-off-by: Jim Wright <jlwright@us.ibm.com> Change-Id: Ie7a13bece9c6cc1246cca733e2275b084bac95c8
|
39ea02bc | 23-Nov-2021 |
Brandon Wyman <bjwyman@gmail.com> |
psu-ng: Add in handling of specific MFR faults
Add in a function to determine what the various bits in statusMFR may be indicating for a fault, based on the type of power supply (device driver bound).
Add in PS_Kill, 12Vcs, and 12V CS faults for IBM power supply types.
Add in creating error logs for PS_Kill, 12Vcs, and 12V CS faults. The 12Vcs and 12V CS faults can essentially be treated the same as VOUT_UV faults (same error type, same call out).
Tested:
  Verified no PS_Kill, 12Vcs, or 12V CS fault on normal Rainier 2S4U
  Simulated PS_Kill fault:
    MFR fault: STATUS_WORD = 0x1840 STATUS_MFR_SPECIFIC = 0x10
  Simulated 12Vcs fault:
    PGOOD fault: STATUS_WORD = 0x1840, STATUS_MFR_SPECIFIC = 0x40
    MFR fault: STATUS_WORD = 0x1840 STATUS_MFR_SPECIFIC = 0x40
  Simulated 12V CS fault/warning:
    MFR fault: STATUS_WORD = 0x1000 STATUS_MFR_SPECIFIC = 0x80
Change-Id: Ie89a58836ecec86dfa2e124eb6ab03e9dccce929 Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
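A sketch of how such a decode step might look. The bit masks are inferred from the STATUS_MFR_SPECIFIC values in the test notes above (0x10, 0x40, 0x80) and should be treated as assumptions; `PsuType`, `MfrFaults` and `decodeMfrFaults` are hypothetical names, not the project's API.

```cpp
#include <cstdint>

// Hypothetical power supply kinds, standing in for "device driver bound".
enum class PsuType
{
    Ibm,
    Other
};

// Which manufacturer-specific faults were decoded.
struct MfrFaults
{
    bool psKill = false;
    bool ps12Vcs = false;
    bool psCS12V = false;
};

// Masks inferred from the simulated values in the test notes (assumption).
constexpr std::uint8_t PS_KILL_MASK = 0x10;
constexpr std::uint8_t PS_12VCS_MASK = 0x40;
constexpr std::uint8_t PS_CS12V_MASK = 0x80;

// Interpret STATUS_MFR_SPECIFIC according to the power supply type.
MfrFaults decodeMfrFaults(PsuType type, std::uint8_t statusMfr)
{
    MfrFaults faults;
    if (type == PsuType::Ibm)
    {
        faults.psKill = (statusMfr & PS_KILL_MASK) != 0;
        faults.ps12Vcs = (statusMfr & PS_12VCS_MASK) != 0;
        faults.psCS12V = (statusMfr & PS_CS12V_MASK) != 0;
    }
    // Other types: no specific meaning is assigned in this sketch.
    return faults;
}
```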
|
a169b0f9 | 07-Dec-2021 |
Brandon Wyman <bjwyman@gmail.com> |
psu-ng: IBM FN goes to SparePartNumber
When reading the IBM FN (FRU_NUMBER), add it to the SparePartNumber asset properties in the D-Bus inventory.
Tested:
root@p10bmc:~# busctl get-property xyz.openbmc_project.Inventory.Manager \
> /xyz/openbmc_project/inventory/system/chassis/motherboard/powersupply0 \
> xyz.openbmc_project.Inventory.Decorator.Asset \
> PartNumber
s "3FP210"
root@p10bmc:~#
root@p10bmc:~# busctl get-property xyz.openbmc_project.Inventory.Manager \
> /xyz/openbmc_project/inventory/system/chassis/motherboard/powersupply1 \
> xyz.openbmc_project.Inventory.Decorator.Asset \
> PartNumber
s "3FP210"
root@p10bmc:~#
root@p10bmc:~# busctl get-property xyz.openbmc_project.Inventory.Manager \
> /xyz/openbmc_project/inventory/system/chassis/motherboard/powersupply0 \
> xyz.openbmc_project.Inventory.Decorator.Asset \
> SparePartNumber
s ""
root@p10bmc:~#
root@p10bmc:~# busctl get-property xyz.openbmc_project.Inventory.Manager \
> /xyz/openbmc_project/inventory/system/chassis/motherboard/powersupply1 \
> xyz.openbmc_project.Inventory.Decorator.Asset \
> SparePartNumber
s ""
root@p10bmc:~#
root@p10bmc:~# # patch needed in /usr
root@p10bmc:~# mkdir -p /tmp/persist/usr
root@p10bmc:~# mkdir -p /tmp/persist/work/usr
root@p10bmc:~# mount -t overlay -o lowerdir=/usr,upperdir=/tmp/persist/usr,workdir=/tmp/persist/work/usr overlay /usr
root@p10bmc:~# md5sum /usr/bin/phosphor-psu-monitor /tmp/phosphor-psu-monitor
ac1a50698a63e53dd3819b9f3c78c378 /usr/bin/phosphor-psu-monitor
4a2806d1a3494d1dd7176cd7b9dadf1a /tmp/phosphor-psu-monitor
root@p10bmc:~# mv /tmp/phosphor-psu-monitor /usr/bin/phosphor-psu-monitor
root@p10bmc:~# md5sum /usr/bin/phosphor-psu-monitor /tmp/phosphor-psu-monitor
4a2806d1a3494d1dd7176cd7b9dadf1a /usr/bin/phosphor-psu-monitor
md5sum: can't open '/tmp/phosphor-psu-monitor': No such file or directory
root@p10bmc:~# systemctl daemon-reload
root@p10bmc:~# systemctl restart phosphor-psu-monitor.service
root@p10bmc:~# busctl get-property xyz.openbmc_project.Inventory.Manager \
> /xyz/openbmc_project/inventory/system/chassis/motherboard/powersupply0 \
> xyz.openbmc_project.Inventory.Decorator.Asset \
> PartNumber
s "3FP210"
root@p10bmc:~#
root@p10bmc:~# busctl get-property xyz.openbmc_project.Inventory.Manager \
> /xyz/openbmc_project/inventory/system/chassis/motherboard/powersupply1 \
> xyz.openbmc_project.Inventory.Decorator.Asset \
> PartNumber
s "3FP210"
root@p10bmc:~#
root@p10bmc:~# busctl get-property xyz.openbmc_project.Inventory.Manager \
> /xyz/openbmc_project/inventory/system/chassis/motherboard/powersupply0 \
> xyz.openbmc_project.Inventory.Decorator.Asset \
> SparePartNumber
s "3FP211"
root@p10bmc:~#
root@p10bmc:~# busctl get-property xyz.openbmc_project.Inventory.Manager \
> /xyz/openbmc_project/inventory/system/chassis/motherboard/powersupply1 \
> xyz.openbmc_project.Inventory.Decorator.Asset \
> SparePartNumber
s "3FP211"
root@p10bmc:~#
Change-Id: I4aaa906f576894f62fa36083c40c89d935d646a8 Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
|
56ae78e1 | 01-Dec-2021 |
Jim Wright <jlwright@us.ibm.com> |
pseq: Retrieve compatible systems from EM
Retrieve compatible systems value from entity manager using appropriate D-Bus methods. Will be used to load correct configuration file containing system rail and pin information.
Signed-off-by: Jim Wright <jlwright@us.ibm.com> Change-Id: I040b4fcb182f752e5ac1a3c930b53464eebef2da
|
9ba38235 | 16-Nov-2021 |
Adriana Kobylak <anoo@us.ibm.com> |
psu-ng: Run validation when PSU is plugged
Subscribe to the Present inventory property so that when a PSU becomes present (it's plugged into the system), it triggers the PSU validation check so that the user can know if the new PSU(s) is supported on the system instead of needing to issue a power on to run the validation.
Tested: At BMC Ready state, set the Present property on powersupply1 to false, then true, and check that the validation ran every time the Present property was set to true. Same when changing the Present property for powersupply0.
Change-Id: I14dc7d5902871284c9c099e81b45e78e4abf83bc Signed-off-by: Adriana Kobylak <anoo@us.ibm.com>
|
06ca4590 | 06-Dec-2021 |
Brandon Wyman <bjwyman@gmail.com> |
psu-ng: Add DEGLITCH_LIMIT, deglitch pgoodFault
While the power supply should not arbitrarily report a PGOOD fault, and then turn it back off, there is a perception that this is indeed possible, a glitch of some sort.
To avoid possibly logging an error for an erroneous fault reporting, make sure the fault is reported more than once before considering it to be a true fault (deglitch the signal).
Tested:
  Real Rainier 2S2U:
    Verify tracing PGOOD faults seen and cleared, no error logged
    Verify PGOOD/OFF error logged when manually set ON_OFF_CONFIG & OPERATION.
    Verify deglitched PGOOD again on restart service (ON_OFF_CONFIG reset).
Change-Id: I54f775004d2e363cff21ff0512bd9283408f1f72 Signed-off-by: Brandon Wyman <bjwyman@gmail.com>
|