*** Settings ***
Documentation       Utility for RAS test scenarios through HOST & BMC.
Resource            ../../lib/utils.robot
Resource            ../../lib/ras/host_utils.robot
Resource            ../../lib/resource.robot
Resource            ../../lib/state_manager.robot
Resource            ../../lib/boot_utils.robot
Variables           ../../lib/ras/variables.py
Variables           ../../data/variables.py
Resource            ../../lib/dump_utils.robot

Library             DateTime
Library             OperatingSystem
Library             random
Library             Collections

*** Variables ***
${stack_mode}       normal

*** Keywords ***

Verify And Clear Gard Records On HOST
    [Documentation]  Verify that gard records exist on HOST and clear them.

    ${output}=  Gard Operations On OS  list
    Should Not Contain  ${output}  No GARD
    Gard Operations On OS  clear all

Verify Error Log Entry
    [Documentation]  Verify error log entry & signature description.
    [Arguments]  ${signature_desc}  ${log_prefix}
    # Description of argument(s):
    # signature_desc  Error log signature description.
    # log_prefix      Log path prefix.

    # TODO: Need to move this keyword to common utility.

    Error Logs Should Exist

    Collect eSEL Log  ${log_prefix}
    ${error_log_file_path}=  Catenate  ${log_prefix}esel.txt
    ${rc}  ${output}=  Run and Return RC and Output
    ...  grep -i ${signature_desc} ${error_log_file_path}
    Should Be Equal  ${rc}  ${0}
    Should Not Be Empty  ${output}

Inject Recoverable Error With Threshold Limit
    [Documentation]  Inject and verify recoverable error on processor through
    ...              BMC/HOST.
    ...              Test sequence:
    ...              1. Inject recoverable error on a given target
    ...                 (e.g. Processor core, CAPP MCA) through BMC/HOST.
    ...              2. Check that HOST is running.
    ...              3. Verify that no gard records exist.
    ...              4. Verify error log entry & signature description.
    [Arguments]      ${interface_type}  ${fir}  ${chip_address}  ${threshold_limit}
    ...              ${signature_desc}  ${log_prefix}
    # Description of argument(s):
    # interface_type      Inject error through 'BMC' or 'HOST'.
    # fir                 FIR (Fault isolation register) value (e.g. 2011400).
    # chip_address        Chip address (e.g. 2000000000000000).
    # threshold_limit     Threshold limit (e.g. 1, 5, 32).
    # signature_desc      Error log signature description.
    # log_prefix          Log path prefix.

    Run Keyword If  '${interface_type}' == 'HOST'
    ...     Inject Error Through HOST  ${fir}  ${chip_address}  ${threshold_limit}
    ...     ${master_proc_chip}
    ...  ELSE
    ...     Inject Error Through BMC  ${fir}  ${chip_address}  ${threshold_limit}
    ...     ${master_proc_chip}

    Is Host Running
    ${output}=  Gard Operations On OS  list
    Should Contain  ${output}  No GARD
    Verify Error Log Entry  ${signature_desc}  ${log_prefix}
    # TODO: Verify SOL console logs.
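    # Example usage (an illustrative sketch, not taken from an existing test):
    # the FIR, chip address and threshold are the sample values from the
    # argument descriptions above, and ${signature_desc} and the 'example_'
    # log prefix are hypothetical placeholders supplied by the caller.
    #
    # Inject Recoverable Error With Threshold Limit  HOST  2011400
    # ...  2000000000000000  1  ${signature_desc}  ${RAS_LOG_DIR_PATH}example_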

Inject Unrecoverable Error
    [Documentation]  Inject and verify unrecoverable error on processor through
    ...              BMC/HOST.
    ...              Test sequence:
    ...              1. Inject unrecoverable error on a given target
    ...                 (e.g. Processor core, CAPP MCA) through BMC/HOST.
    ...              2. Check that HOST is rebooted.
    ...              3. Verify error log entry & signature description.
    ...              4. Verify & clear dump entry.
    ...              5. Verify & clear gard records.
    [Arguments]      ${interface_type}  ${fir}  ${chip_address}  ${threshold_limit}
    ...              ${signature_desc}  ${log_prefix}
    # Description of argument(s):
    # interface_type      Inject error through 'BMC' or 'HOST'.
    # fir                 FIR (Fault isolation register) value (e.g. 2011400).
    # chip_address        Chip address (e.g. 2000000000000000).
    # threshold_limit     Threshold limit (e.g. 1, 5, 32).
    # signature_desc      Error log signature description.
    #                     (e.g. 'mcs(n0p0c0) (MCFIR[0]) mc internal recoverable')
    # log_prefix          Log path prefix.

    Run Keyword If  '${interface_type}' == 'HOST'
    ...     Inject Error Through HOST  ${fir}  ${chip_address}  ${threshold_limit}
    ...     ${master_proc_chip}
    ...  ELSE
    ...     Inject Error Through BMC  ${fir}  ${chip_address}  ${threshold_limit}
    ...     ${master_proc_chip}

    Wait Until Keyword Succeeds  500 sec  20 sec  Is Host Rebooted
    Wait for OS
    Verify Error Log Entry  ${signature_desc}  ${log_prefix}
    ${resp}=  OpenBMC Get Request  ${DUMP_ENTRY_URI}list
    Should Not Be Equal As Strings  ${resp.status_code}  ${HTTP_NOT_FOUND}
    Delete All BMC Dump
    Verify And Clear Gard Records On HOST

Fetch FIR Address Translation Value
    [Documentation]  Fetch FIR address translation value through HOST.
    [Arguments]  ${fir}  ${target_type}
    # Description of argument(s):
    # fir                  FIR (Fault isolation register) value (e.g. '2011400').
    # target_type          Target type (e.g. 'EX', 'EQ', 'C').

    Login To OS Host
    Copy Address Translation Utils To HOST OS

    # Fetch processor chip ID.
    ${proc_chip_id}=  Get ProcChipId From OS  Processor  ${master_proc_chip}
    # Example output:
    # 00000000

    ${core_ids}=  Get Core IDs From OS  ${proc_chip_id[-1]}
    # Example output:
    # ./probe_cpus.sh | grep 'CHIP ID: 0' | cut -c21-22
    # ['14', '15', '16', '17']

    # Ignore the master core ID (first entry in the list).
    ${non_master_core_ids}=  Get Slice From List  ${core_ids}  1
    # Fetch a random non-master core ID.
    ${core_ids_sub_list}=   Evaluate  random.sample(${non_master_core_ids}, 1)  random
    ${core_id}=  Get From List  ${core_ids_sub_list}  0
    ${translated_fir_addr}=  FIR Address Translation Through HOST
    ...  ${fir}  ${core_id}  ${target_type}

    [Return]  ${translated_fir_addr}
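    # Example usage (an illustrative sketch): the FIR value and target type
    # are the sample values from the argument descriptions above and are not
    # verified against real hardware.
    #
    # ${translated_fir}=  Fetch FIR Address Translation Value  2011400  EX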

RAS Test SetUp
    [Documentation]  Validate input parameters and boot the host to OS.

    Should Not Be Empty
    ...  ${OS_HOST}  msg=You must provide DNS name/IP of the OS host.
    Should Not Be Empty
    ...  ${OS_USERNAME}  msg=You must provide OS host user name.
    Should Not Be Empty
    ...  ${OS_PASSWORD}  msg=You must provide OS host user password.

    # Boot to OS.
    REST Power On  quiet=${1}
    # Adding delay after host bring up.
    Sleep  60s

RAS Suite Setup
    [Documentation]  Create RAS log directory to store all RAS test logs,
    ...              boot to OS and make sure the Opal-PRD service is enabled.

    ${RAS_LOG_DIR_PATH}=  Catenate  ${EXECDIR}/RAS_logs/
    Set Suite Variable  ${RAS_LOG_DIR_PATH}
    Set Suite Variable  ${master_proc_chip}  False

    Create Directory  ${RAS_LOG_DIR_PATH}
    OperatingSystem.Directory Should Exist  ${RAS_LOG_DIR_PATH}
    Empty Directory  ${RAS_LOG_DIR_PATH}

    Should Not Be Empty  ${ESEL_BIN_PATH}
    Set Environment Variable  PATH  %{PATH}:${ESEL_BIN_PATH}

    # Boot to OS.
    REST Power On  quiet=${1}

    # Check that the Opal-PRD service is enabled on host.
    ${opal_prd_state}=  Is Opal-PRD Service Enabled
    Run Keyword If  '${opal_prd_state}' == 'disabled'
    ...  Enable Opal-PRD Service On HOST

RAS Suite Cleanup
    [Documentation]  Perform RAS suite cleanup and verify that host
    ...              boots after test suite run.

    # Boot to OS.
    REST Power On  quiet=${1}
    Delete Error Logs
    Gard Operations On OS  clear all