xref: /openbmc/phosphor-power/phosphor-regulators/docs/phase_fault_monitoring.md (revision 80689db1ed2f4677bd3ea3fc3b74f26900bf44d9)
1*80689db1SShawn McCarney# Phase Fault Monitoring
2*80689db1SShawn McCarney
3*80689db1SShawn McCarney## Overview
4*80689db1SShawn McCarney
5*80689db1SShawn McCarneySome voltage regulators contain redundant phases. If a redundant phase fails,
6*80689db1SShawn McCarneythe regulator can continue functioning normally. However redundancy has been
7*80689db1SShawn McCarneylost, and the regulator may fail if another phase fails.
8*80689db1SShawn McCarney
9*80689db1SShawn McCarneyVoltage regulators can be monitored for redundant phase faults. If a fault is
10*80689db1SShawn McCarneydetected, an error is logged on the BMC.
11*80689db1SShawn McCarney
12*80689db1SShawn McCarneyWhen [regulator monitoring](monitoring.md) is enabled, the `phosphor-regulators`
13*80689db1SShawn McCarneyapplication checks for redundant phase faults every 15 seconds.
14*80689db1SShawn McCarney
15*80689db1SShawn McCarneyA phase fault must be detected two consecutive times (15 seconds apart) before
16*80689db1SShawn McCarneyan error is logged. This provides "de-glitching" to ignore transient hardware
17*80689db1SShawn McCarneyproblems.
18*80689db1SShawn McCarney
19*80689db1SShawn McCarneyA phase fault error will only be logged for a regulator once per system boot.
20*80689db1SShawn McCarney
21*80689db1SShawn McCarney## How phase fault detection is defined
22*80689db1SShawn McCarney
23*80689db1SShawn McCarneyPhase fault detection is defined for a voltage regulator using the
24*80689db1SShawn McCarney[phase_fault_detection](config_file/phase_fault_detection.md) object in the
25*80689db1SShawn McCarney[JSON config file](config_file/README.md).
26*80689db1SShawn McCarney
27*80689db1SShawn McCarneyThis object specifies actions to run, such as
28*80689db1SShawn McCarney[i2c_compare_byte](config_file/i2c_compare_byte.md).
29*80689db1SShawn McCarney
30*80689db1SShawn McCarney## Error handling
31*80689db1SShawn McCarney
32*80689db1SShawn McCarneyIf a different type of error occurs while detecting phase faults in a regulator:
33*80689db1SShawn McCarney
34*80689db1SShawn McCarney- The error will be logged. If the same error occurs repeatedly on regulator, it
35*80689db1SShawn McCarney  will only be logged once per system boot.
36*80689db1SShawn McCarney- Any remaining actions for the regulator will be skipped.
37*80689db1SShawn McCarney- Phase fault detection will continue with the next regulator.
38*80689db1SShawn McCarney- Phase fault detection will be attempted again for this regulator during the
39*80689db1SShawn McCarney  next monitoring cycle.
40