#
41d507e5 |
| 05-Oct-2021 |
Shantappa Teekappanavar <sbteeks@yahoo.com> |
Watchdog timeout support in SBE boot window
Added support for handling an SBE boot failure when the watchdog times out in the SBE boot window. FFDC information from the SBE is captured using the libphal-provided API, and an SBE-specific PEL is created when the FFDC is valid. If the error is an SBE timeout, or no FFDC data is available, an SBE dump is initiated to capture additional debug data.
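The dump decision described above can be sketched as follows. This is a minimal illustration in Python, not the code from the commit: the names `FfdcStatus` and `needs_sbe_dump` are hypothetical, and the sketch assumes the FFDC capture result can be reduced to one of three states.

```python
from enum import Enum, auto


class FfdcStatus(Enum):
    """Hypothetical classification of the FFDC capture result."""
    VALID = auto()     # FFDC data was returned by the SBE
    NO_DATA = auto()   # the SBE returned no FFDC data
    TIMEOUT = auto()   # the SBE did not respond in time


def needs_sbe_dump(status: FfdcStatus) -> bool:
    """Return True when an SBE dump should be requested.

    Per the commit message, a dump is initiated for an SBE timeout or
    when no FFDC data is available; valid FFDC is reported through an
    SBE-specific PEL instead.
    """
    return status in (FfdcStatus.NO_DATA, FfdcStatus.TIMEOUT)
```

For example, `needs_sbe_dump(FfdcStatus.TIMEOUT)` evaluates to `True`, while `needs_sbe_dump(FfdcStatus.VALID)` evaluates to `False`.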
Tested:
- Verified PEL log:
  root@p10bmc:~# peltool -l
  {
      "0x50000332": {
          "SRC": "BD123504",
          "Message": "timeout reported during SBE boot process",
          "PLID": "0x50000332",
          "CreatorID": "BMC",
          "Subsystem": "Processor Chip Cache",
          "Commit Time": "10/04/2021 18:25:27",
          "Sev": "Unrecoverable Error",
          "CompID": "0x3500"
      }
  }
- Verified SBE dump was collected. Steps used:
  1. obmcutil poweroff
  2. istep -s0
  3. systemctl start org.open_power.Dump.Manager.service
  4. systemctl start openpower-debug-collector-watchdog@0.service
  5. Check the journal log to see that the SBE dump was requested, the dump entry was created, and the dump completed:
     journalctl -f -t watchdog_timeout
  6. Verify the SBE dump:
     ls /var/lib/phosphor-debug-collector/sbedump/<dump-entry-id>
- Verified Hostboot dump was collected. Steps used:
  1. obmcutil poweroff
  2. istep -s0..6
  3. systemctl start org.open_power.Dump.Manager.service
  4. systemctl start openpower-debug-collector-watchdog@0.service
  5. Check the journal log to see that the Hostboot dump was requested, the dump entry was created, and the dump completed:
     journalctl -f -t watchdog_timeout
  6. Verify the Hostboot dump:
     ls /var/lib/phosphor-debug-collector/hostbootdump/<dump-entry-id>
Signed-off-by: Shantappa Teekappanavar <sbteeks@yahoo.com>
Change-Id: Ibfe7cc6619cd99f303c6106e617bc636632d0940
|
#
1ac6162d |
| 22-Jun-2021 |
Shantappa Teekappanavar <sbteeks@yahoo.com> |
watchdog: Collect hostboot dump when watchdog times out
Hostboot dump collection initiated by watchdog_timeout is disabled by default. When the watchdog times out, only the error message corresponding to the watchdog timeout is logged. To enable hostboot dump collection whenever the watchdog times out, the meson option 'hostboot-dump-collection' must be enabled.
Testing - with meson option 'hostboot-dump-collection' enabled, ran watchdog_timeout:
case-1: CurrentHostState - off, AutoReboot - false
  - Verified PEL object was not created
  - Verified hostboot dump was not created
  - Verified the Host State changed to Quiesce
case-2: CurrentHostState - off, AutoReboot - true
  - Verified PEL object was created
  - Verified hostboot dump was not created
  - Verified the Host State changed to Running
case-3: CurrentHostState - Running, AutoReboot - false
  - Verified PEL object was not created
  - Verified hostboot dump was not created
  - Verified the Host State changed to Quiesce
case-4: CurrentHostState - Running, AutoReboot - true, default timeout = 300s
  - Verified PEL object was created
  - Verified hostboot dump was created
  - Observed Host State moving to either Running or Quiesce
case-5: CurrentHostState - Running, AutoReboot - true, specified timeout = 5s
  - Verified PEL object was created
  - Verified hostboot dump was created
  - Observed Host State moving to either Running or Quiesce
Docker Unit test: passed
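The five test cases above follow a simple pattern, which can be summarized as below. This is only a sketch that encodes the observed results with 'hostboot-dump-collection' enabled; it is not the implementation from the commit, and the names `Outcome` and `observed_outcome` are hypothetical.

```python
from typing import NamedTuple


class Outcome(NamedTuple):
    """Hypothetical record of what the test cases reported."""
    pel_created: bool
    hostboot_dump_created: bool


def observed_outcome(host_running: bool, auto_reboot: bool) -> Outcome:
    """Summarize the reported test results (not the actual logic).

    In the listed cases, a PEL appeared only when AutoReboot was true,
    and a hostboot dump appeared only when the host was also Running.
    """
    pel_created = auto_reboot
    dump_created = host_running and auto_reboot
    return Outcome(pel_created, dump_created)
```

The timeout value (300s default versus 5s) did not change either outcome in the reported cases, so it is left out of the sketch.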
Signed-off-by: Shantappa Teekappanavar <sbteeks@yahoo.com>
Change-Id: Ib92d0c2f282816fb742cf07c1cb876b2cc093c12
|