# Power Good Faults

## Overview

The power sequencer device provides a chassis power good (pgood) signal. This
indicates that all of the main (non-standby) voltage rails are powered on.

If the chassis pgood state is false when it should be true, a chassis power good
(pgood) fault has occurred.

## Pgood fault while powering on the system

When the power sequencer device is powering on the main voltage rails in order,
one of the rails may fail to power on. This is often due to a hardware problem.

When a voltage rail fails to power on, the power sequencer device may
immediately indicate an error. However, the device may instead wait indefinitely
for the rail to power on. In both cases the chassis pgood signal never changes
to true.

## Pgood fault after the system was powered on

A pgood fault can occur after a system has been powered on. The system may have
been successfully running for days or months.

A voltage rail may suddenly power off or stop providing the expected level of
voltage. This could occur if the voltage regulator stops working or if it shuts
itself off due to exceeding a temperature/voltage/current limit.

The power sequencer device will detect that the voltage rail has failed. The
device will change the state of the chassis pgood signal to false. The device
may also power off several other related voltage rails, depending on how the
hardware is configured.

## Pgood fault handling

`phosphor-power-sequencer` detects a pgood fault by monitoring the chassis pgood
signal:

- Powering on chassis: pgood signal never changes to true.
- Chassis was powered on: pgood signal changes from true to false.

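The two detection cases above can be sketched roughly as follows. This is an
illustrative Python model, not the application's code. The names `PowerState`,
`is_pgood_fault`, and `POWER_ON_TIMEOUT_SECONDS` are hypothetical, and the sketch
assumes that "never changes to true" is decided by a timeout whose value is not
given in this document.

```python
from enum import Enum


class PowerState(Enum):
    """Hypothetical chassis power states used only in this sketch."""

    POWERING_ON = "powering_on"
    POWERED_ON = "powered_on"


# Assumed timeout for illustration; the real value is not stated here.
POWER_ON_TIMEOUT_SECONDS = 10.0


def is_pgood_fault(
    state: PowerState, pgood: bool, seconds_since_power_on_request: float
) -> bool:
    """Return True if the pgood signal indicates a pgood fault."""
    if state is PowerState.POWERING_ON:
        # Case 1: pgood has still not changed to true when the timeout expires.
        return (
            not pgood
            and seconds_since_power_on_request >= POWER_ON_TIMEOUT_SECONDS
        )
    if state is PowerState.POWERED_ON:
        # Case 2: pgood was true and has now changed to false.
        return not pgood
    return False
```

For example, `is_pgood_fault(PowerState.POWERED_ON, False, 0.0)` reports a fault
immediately, while a false pgood signal early in a power on attempt does not.
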
When a pgood fault is detected, `phosphor-power-sequencer` will perform the
following steps:

- Use information from the power sequencer device to determine the cause of the
  fault.
- Log an error with information about the fault.
- If this is a single chassis system:
  - The system will be [powered off](powering_off.md).
- If this is a multiple chassis system:
  - Wait a short period of time, and then check if all the other chassis that
    were powered on are also experiencing a pgood fault. If so, check if any
    chassis is experiencing a brownout or blackout. This determines whether this
    is a chassis-specific problem or a system-wide problem due to a
    [Power Loss](power_loss.md).
  - If this is a chassis-specific problem, add the inventory path of the chassis
    to the error log. This may result in hardware isolation, which will cause
    the `Enabled` property of the chassis to be false.
  - The system will be powered [off](powering_off.md) and then
    [on](powering_on.md) again.
  - Chassis with an `Enabled` value of false will **not** be powered back on.

See [Chassis Status](chassis_status.md) for more information on the `Enabled`
property.

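The multiple chassis decision in the steps above could be modeled along these
lines. This is only a sketch under stated assumptions: `ChassisStatus` and
`is_system_wide_power_loss` are hypothetical names, the status fields are
assumed to be already collected after the short wait, and "any chassis" is
interpreted as the faulted chassis plus all the other chassis.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChassisStatus:
    """Hypothetical snapshot of one chassis, taken after the short wait."""

    number: int
    was_powered_on: bool
    has_pgood_fault: bool
    has_power_loss: bool  # True if a brownout or blackout was detected


def is_system_wide_power_loss(
    faulted: ChassisStatus, others: List[ChassisStatus]
) -> bool:
    """Return True for a system-wide problem, False for a chassis-specific one."""
    powered_on = [c for c in others if c.was_powered_on]
    # Step 1: every other chassis that was powered on must also have a fault.
    if not all(c.has_pgood_fault for c in powered_on):
        return False
    # Step 2: at least one chassis must be experiencing a brownout or blackout.
    return any(c.has_power_loss for c in [faulted, *others])
```

When the function returns False, the fault is treated as chassis-specific, which
is the case where the chassis inventory path would be added to the error log.
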
Note that when a pgood error happens **during** a power on attempt, the
`phosphor-chassis-state-manager` application handles the power off/power cycle.
When the pgood error happens **after** the system was powered on, the
`phosphor-power-sequencer` application handles the power off/power cycle. This
is due to the complex service file relationships that occur during a power on
attempt.

## Determining the cause of a pgood fault

It is very helpful to determine which voltage rail caused a pgood fault. That
determines what hardware potentially needs to be replaced.

Determining the correct rail requires the following:

- The power sequencer device type is supported by `phosphor-power-sequencer`.
- A [JSON configuration file](config_file/README.md) is defined for the system.

If those requirements are not met, a general pgood error will be logged.

If those requirements are met, `phosphor-power-sequencer` will attempt to
determine which voltage rail caused the chassis pgood fault. The following
methods are supported in the JSON configuration file:

- Read a GPIO from the power sequencer device
- Check the PMBus STATUS_VOUT command value
- Compare the PMBus READ_VOUT value to the PMBus VOUT_UV_FAULT_LIMIT value

Multiple methods might need to be used on the same rail. For example, the PMBus
STATUS_VOUT error bits might be set for a pgood fault after the system powered
on, but they might not be set during a power on attempt because the power
sequencer is waiting indefinitely for the rail to power on.

See the [rail](config_file/rail.md) object in the configuration file for more
information.

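To illustrate why several methods may be combined on one rail, here is a small
Python sketch. It is not the application's implementation: `RailReadings` and
`rail_has_pgood_fault` are hypothetical names, a value of `None` is assumed to
mean the method is not configured for the rail, any set STATUS_VOUT bit is
treated as a fault (a simplification), and a READ_VOUT value at or below
VOUT_UV_FAULT_LIMIT is assumed to indicate a fault.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RailReadings:
    """Hypothetical values already read from the device for one rail."""

    gpio_indicates_fault: Optional[bool] = None  # None: GPIO check not used
    status_vout: Optional[int] = None  # None: STATUS_VOUT check not used
    read_vout: Optional[float] = None  # volts; None: comparison not used
    vout_uv_fault_limit: Optional[float] = None  # volts


def rail_has_pgood_fault(readings: RailReadings) -> bool:
    """Return True if any configured method indicates a fault on this rail."""
    # Method 1: GPIO read from the power sequencer device.
    if readings.gpio_indicates_fault:
        return True
    # Method 2: STATUS_VOUT; this sketch treats any set bit as a fault.
    if readings.status_vout is not None and readings.status_vout != 0:
        return True
    # Method 3: output voltage at or below the undervoltage fault limit.
    if (
        readings.read_vout is not None
        and readings.vout_uv_fault_limit is not None
        and readings.read_vout <= readings.vout_uv_fault_limit
    ):
        return True
    return False
```

In the power on scenario described above, STATUS_VOUT may still read 0, yet
`RailReadings(status_vout=0, read_vout=0.2, vout_uv_fault_limit=0.9)` is still
flagged by the voltage comparison.
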
If a specific voltage rail is found, an error is logged against that rail.

If the voltage rail is from the power supplies, and the `phosphor-power-supply`
application found a power supply error, then the power supply error is logged as
the cause of the pgood fault.

If no voltage rail is found, a general pgood error is logged.

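The rules in this section for choosing which error to log can be summarized in
a short sketch. The function name `select_pgood_error`, its parameters, and the
returned strings are hypothetical labels for illustration; they are not the
actual error names used by `phosphor-power-sequencer`.

```python
from typing import Optional


def select_pgood_error(
    requirements_met: bool,
    failed_rail: Optional[str],
    is_power_supply_rail: bool = False,
    power_supply_error: Optional[str] = None,
) -> str:
    """Return a label describing which error would be logged."""
    # Unsupported device, no configuration file, or no rail identified.
    if not requirements_met or failed_rail is None:
        return "general pgood error"
    # The rail comes from the power supplies and a supply error was found.
    if is_power_supply_rail and power_supply_error is not None:
        return f"power supply error: {power_supply_error}"
    # Otherwise the error is logged against the specific rail.
    return f"pgood error on rail {failed_rail}"
```

For instance, `select_pgood_error(True, None)` yields the general error, while a
failed power supply rail with a known supply error reports the supply error
instead of the rail.
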