*** Settings ***
Documentation       Utility for RAS test scenarios through HOST & BMC.
Resource            ../../lib/utils.robot
Resource            ../../lib/ras/host_utils.robot
Resource            ../../lib/resource.robot
Resource            ../../lib/state_manager.robot
Resource            ../../lib/boot_utils.robot
Variables           ../../lib/ras/variables.py
Variables           ../../data/variables.py
Resource            ../../lib/dump_utils.robot

Library             DateTime
Library             OperatingSystem
Library             random
Library             Collections

*** Variables ***
${stack_mode}       normal

*** Keywords ***

Verify And Clear Gard Records On HOST
    [Documentation]  Verify and clear gard records on HOST.

    ${output}=  Gard Operations On OS  list
    Should Not Contain  ${output}  No GARD
    Gard Operations On OS  clear all

Verify Error Log Entry
    [Documentation]  Verify error log entry & signature description.
    [Arguments]  ${signature_desc}  ${log_prefix}
    # Description of argument(s):
    # signature_desc  Error log signature description.
    # log_prefix      Log path prefix.

    # TODO: Need to move this keyword to common utility.

    Error Logs Should Exist

    Collect eSEL Log  ${log_prefix}
    ${error_log_file_path}=  Catenate  ${log_prefix}esel.txt
    ${rc}  ${output}=  Run and Return RC and Output
    ...  grep -i ${signature_desc} ${error_log_file_path}
    Should Be Equal  ${rc}  ${0}
    Should Not Be Empty  ${output}

Inject Recoverable Error With Threshold Limit
    [Documentation]  Inject and verify recoverable error on processor through
    ...              BMC/HOST.
    ...              Test sequence:
    ...              1. Inject recoverable error on a given target
    ...                 (e.g. Processor core, CAPP, MCA) through BMC/HOST.
    ...              2. Check if HOST is running.
    ...              3. Verify that no gard records exist.
    ...              4. Verify error log entry & signature description.
    [Arguments]      ${interface_type}  ${fir_address}  ${value}  ${threshold_limit}
    ...              ${signature_desc}  ${log_prefix}
    # Description of argument(s):
    # interface_type      Inject error through 'BMC' or 'HOST'.
    # fir_address         FIR (Fault isolation register) value (e.g. 2011400).
    # value               Value to inject (e.g. 2000000000000000).
    # threshold_limit     Threshold limit (e.g. 1, 5, 32).
    # signature_desc      Error log signature description.
    # log_prefix          Log path prefix.
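    #
    # Example call (illustrative sketch only; the values reuse the examples
    # above, and the "example_" log prefix is a hypothetical name):
    # Inject Recoverable Error With Threshold Limit  BMC  2011400
    # ...  2000000000000000  1  ${signature_desc}  ${RAS_LOG_DIR_PATH}example_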

    Run Keyword  Inject Error Through ${interface_type}
    ...  ${fir_address}  ${value}  ${threshold_limit}  ${master_proc_chip}

    Is Host Running
    ${output}=  Gard Operations On OS  list
    Should Contain  ${output}  No GARD
    Verify Error Log Entry  ${signature_desc}  ${log_prefix}
    # TODO: Verify SOL console logs.

Inject Unrecoverable Error
    [Documentation]  Inject and verify unrecoverable error on processor through
    ...              BMC/HOST.
    ...              Test sequence:
    ...              1. Inject unrecoverable error on a given target
    ...                 (e.g. Processor core, CAPP, MCA) through BMC/HOST.
    ...              2. Check if HOST is rebooted.
    ...              3. Verify error log entry & signature description.
    ...              4. Verify & clear dump entry.
    ...              5. Verify & clear gard records.
    [Arguments]      ${interface_type}  ${fir_address}  ${value}  ${threshold_limit}
    ...              ${signature_desc}  ${log_prefix}  ${bmc_reboot}=${0}
    # Description of argument(s):
    # interface_type      Inject error through 'BMC' or 'HOST'.
    # fir_address         FIR (Fault isolation register) value (e.g. 2011400).
    # value               Value to inject (e.g. 2000000000000000).
    # threshold_limit     Threshold limit (e.g. 1, 5, 32).
    # signature_desc      Error log signature description.
    #                     (e.g. 'mcs(n0p0c0) (MCFIR[0]) mc internal recoverable')
    # log_prefix          Log path prefix.
    # bmc_reboot          Reboot the BMC after error injection if set.
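    #
    # Example call (illustrative sketch only; the values reuse the examples
    # above, and the "example_" log prefix is a hypothetical name). Passing
    # bmc_reboot=${1} selects the BMC reboot branch below:
    # Inject Unrecoverable Error  HOST  2011400  2000000000000000  1
    # ...  ${signature_desc}  ${RAS_LOG_DIR_PATH}example_  bmc_reboot=${1}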

    Run Keyword  Inject Error Through ${interface_type}
    ...  ${fir_address}  ${value}  ${threshold_limit}  ${master_proc_chip}

    # Do BMC reboot after error injection.
    Run Keyword If  ${bmc_reboot}  Run Keywords
    ...    Initiate BMC Reboot
    ...    Wait For BMC Ready
    ...    Initiate Host PowerOff
    ...    Initiate Host Boot
    ...  ELSE
    ...    Wait Until Keyword Succeeds  500 sec  20 sec  Is Host Rebooted

    Wait for OS
    Verify Error Log Entry  ${signature_desc}  ${log_prefix}
    Read Properties  ${DUMP_ENTRY_URI}list
    Delete All BMC Dump
    Verify And Clear Gard Records On HOST

Fetch FIR Address Translation Value
    [Documentation]  Fetch FIR address translation value through HOST.
    [Arguments]  ${fir_address}  ${target_type}
    # Description of argument(s):
    # fir_address          FIR (Fault isolation register) value (e.g. '2011400').
    # target_type          Target type (e.g. 'EX', 'EQ', 'C').
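    #
    # Example call (illustrative sketch only; the values reuse the examples
    # above):
    # ${translated_fir_addr}=  Fetch FIR Address Translation Value  2011400  EX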

    Login To OS Host
    Copy Address Translation Utils To HOST OS

    # Fetch processor chip IDs.
    ${proc_chip_id}=  Get ProcChipId From OS  Processor  ${master_proc_chip}
    # Example output:
    # 00000000

    ${core_ids}=  Get Core IDs From OS  ${proc_chip_id[-1]}
    # Example output:
    # ./probe_cpus.sh | grep 'CHIP ID: 0' | cut -c21-22
    # ['14', '15', '16', '17']

    # Ignore the master core ID (the first entry in the list).
    ${non_master_core_ids}=  Get Slice From List  ${core_ids}  1
    # Fetch a random non-master core ID.
    ${core_ids_sub_list}=   Evaluate
    ...  random.sample(${non_master_core_ids}, 1)  random
    ${core_id}=  Get From List  ${core_ids_sub_list}  0
    ${translated_fir_addr}=  FIR Address Translation Through HOST
    ...  ${fir_address}  ${core_id}  ${target_type}

    [Return]  ${translated_fir_addr}

RAS Test SetUp
    [Documentation]  Validate input parameters and boot the host to OS.

    Should Not Be Empty
    ...  ${OS_HOST}  msg=You must provide DNS name/IP of the OS host.
    Should Not Be Empty
    ...  ${OS_USERNAME}  msg=You must provide OS host user name.
    Should Not Be Empty
    ...  ${OS_PASSWORD}  msg=You must provide OS host user password.

    Smart Power Off

    # Boot to OS.
    REST Power On  quiet=${1}
    # Adding delay after host bring up.
    Sleep  60s

RAS Suite Setup
    [Documentation]  Create RAS log directory to store all RAS test logs,
    ...              boot the host to OS and make sure the Opal-PRD service
    ...              is enabled.

    ${RAS_LOG_DIR_PATH}=  Catenate  ${EXECDIR}/RAS_logs/
    Set Suite Variable  ${RAS_LOG_DIR_PATH}
    Set Suite Variable  ${master_proc_chip}  False

    Create Directory  ${RAS_LOG_DIR_PATH}
    OperatingSystem.Directory Should Exist  ${RAS_LOG_DIR_PATH}
    Empty Directory  ${RAS_LOG_DIR_PATH}

    Should Not Be Empty  ${ESEL_BIN_PATH}
    Set Environment Variable  PATH  %{PATH}:${ESEL_BIN_PATH}

    # Boot to OS.
    REST Power On  quiet=${1}

    # Check Opal-PRD service enabled on host.
    ${opal_prd_state}=  Is Opal-PRD Service Enabled
    Run Keyword If  '${opal_prd_state}' == 'disabled'
    ...  Enable Opal-PRD Service On HOST

RAS Suite Cleanup
    [Documentation]  Perform RAS suite cleanup and verify that host
    ...              boots after test suite run.

    # Boot to OS.
    REST Power On
    Delete Error Logs
    Gard Operations On OS  clear all

Inject Error At HOST Boot Path
    [Documentation]  Inject and verify recoverable error on processor through
    ...              BMC using pdbg tool at HOST Boot path.
    ...              Test sequence:
    ...              1. Inject error on a given target
    ...                 (e.g. Processor core, CAPP, MCA) through BMC using
    ...                 pdbg tool at HOST Boot path.
    ...              2. Check if HOST is rebooted and running.
    ...              3. Verify error log entry & signature description.
    ...              4. Verify & clear gard records.
    [Arguments]      ${fir_address}  ${value}  ${signature_desc}  ${log_prefix}
    # Description of argument(s):
    # fir_address         FIR (Fault isolation register) value (e.g. 2011400).
    # value               Value to inject (e.g. 2000000000000000).
    # signature_desc      Error log signature description.
    # log_prefix          Log path prefix.

    Inject Error Through BMC At HOST Boot  ${fir_address}  ${value}

    Wait Until Keyword Succeeds  500 sec  20 sec  Is Host Rebooted
    Wait for OS
    Verify Error Log Entry  ${signature_desc}  ${log_prefix}
    Verify And Clear Gard Records On HOST
