*** Settings ***
Documentation       Utility for RAS test scenarios through HOST & BMC.
Resource            ../../lib/utils.robot
Resource            ../../lib/ras/host_utils.robot
Resource            ../../lib/resource.robot
Resource            ../../lib/state_manager.robot
Resource            ../../lib/boot_utils.robot
Variables           ../../lib/ras/variables.py
Variables           ../../data/variables.py
Resource            ../../lib/dump_utils.robot

Library             DateTime
Library             OperatingSystem
Library             random
Library             Collections

*** Variables ***
${stack_mode}       normal

*** Keywords ***

Verify And Clear Gard Records On HOST
    [Documentation]  Verify that gard records exist on HOST, then clear them.

    ${output}=  Gard Operations On OS  list
    Should Not Contain  ${output}  No GARD
    Gard Operations On OS  clear all

Verify Error Log Entry
    [Documentation]  Verify error log entry & signature description.
    [Arguments]  ${signature_desc}  ${log_prefix}
    # Description of argument(s):
    # signature_desc  Error log signature description.
    # log_prefix      Log path prefix.

    # TODO: Need to move this keyword to common utility.

    Error Logs Should Exist

    Collect eSEL Log  ${log_prefix}
    ${error_log_file_path}=  Catenate  ${log_prefix}esel.txt
    # The whole grep command is passed as a single argument (single spaces).
    ${rc}  ${output}=  Run and Return RC and Output
    ...  grep -i ${signature_desc} ${error_log_file_path}
    Should Be Equal  ${rc}  ${0}
    Should Not Be Empty  ${output}

Inject Recoverable Error With Threshold Limit
    [Documentation]  Inject and verify recoverable error on processor through
    ...              BMC/HOST.
    ...              Test sequence:
    ...              1. Inject recoverable error on a given target
    ...                 (e.g. Processor core, CAPP, MCA) through BMC/HOST.
    ...              2. Check if HOST is running.
    ...              3. Verify that no gard records exist.
    ...              4. Verify error log entry & signature description.
    [Arguments]  ${interface_type}  ${fir}  ${chip_address}  ${threshold_limit}
    ...  ${signature_desc}  ${log_prefix}
    # Description of argument(s):
    # interface_type  Inject error through 'BMC' or 'HOST'.
    # fir             FIR (Fault isolation register) value (e.g. 2011400).
    # chip_address    Chip address (e.g. 2000000000000000).
    # threshold_limit Threshold limit (e.g. 1, 5, 32).
    # signature_desc  Error log signature description.
    # log_prefix      Log path prefix.

    Run Keyword If  '${interface_type}' == 'HOST'
    ...  Inject Error Through HOST  ${fir}  ${chip_address}  ${threshold_limit}
    ...  ${master_proc_chip}
    ...  ELSE
    ...  Inject Error Through BMC  ${fir}  ${chip_address}  ${threshold_limit}
    ...  ${master_proc_chip}

    Is Host Running
    ${output}=  Gard Operations On OS  list
    Should Contain  ${output}  No GARD
    Verify Error Log Entry  ${signature_desc}  ${log_prefix}
    # TODO: Verify SOL console logs.


Inject Unrecoverable Error
    [Documentation]  Inject and verify unrecoverable error on processor through
    ...              BMC/HOST.
    ...              Test sequence:
    ...              1. Inject unrecoverable error on a given target
    ...                 (e.g. Processor core, CAPP, MCA) through BMC/HOST.
    ...              2. Check if HOST is rebooted.
    ...              3. Verify error log entry & signature description.
    ...              4. Verify & clear dump entry.
    ...              5. Verify & clear gard records.
    [Arguments]  ${interface_type}  ${fir}  ${chip_address}  ${threshold_limit}
    ...  ${signature_desc}  ${log_prefix}
    # Description of argument(s):
    # interface_type  Inject error through 'BMC' or 'HOST'.
    # fir             FIR (Fault isolation register) value (e.g. 2011400).
    # chip_address    Chip address (e.g. 2000000000000000).
    # threshold_limit Threshold limit (e.g. 1, 5, 32).
    # signature_desc  Error log signature description
    #                 (e.g. 'mcs(n0p0c0) (MCFIR[0]) mc internal recoverable').
    # log_prefix      Log path prefix.

    Run Keyword If  '${interface_type}' == 'HOST'
    ...  Inject Error Through HOST  ${fir}  ${chip_address}  ${threshold_limit}
    ...  ${master_proc_chip}
    ...  ELSE
    ...  Inject Error Through BMC  ${fir}  ${chip_address}  ${threshold_limit}
    ...  ${master_proc_chip}

    Wait Until Keyword Succeeds  500 sec  20 sec  Is Host Rebooted
    Wait for OS
    Verify Error Log Entry  ${signature_desc}  ${log_prefix}
    ${resp}=  OpenBMC Get Request  ${DUMP_ENTRY_URI}list
    Should Not Be Equal As Strings  ${resp.status_code}  ${HTTP_NOT_FOUND}
    Delete All BMC Dump
    Verify And Clear Gard Records On HOST

Fetch FIR Address Translation Value
    [Documentation]  Fetch FIR address translation value through HOST.
    [Arguments]  ${fir}  ${target_type}
    # Description of argument(s):
    # fir          FIR (Fault isolation register) value (e.g. '2011400').
    # target_type  Target type (e.g. 'EX', 'EQ', 'C').

    Login To OS Host
    Copy Address Translation Utils To HOST OS

    # Fetch processor chip IDs.
    ${proc_chip_id}=  Get ProcChipId From OS  Processor  ${master_proc_chip}
    # Example output:
    # 00000000

    ${core_ids}=  Get Core IDs From OS  ${proc_chip_id[-1]}
    # Example output:
    # ./probe_cpus.sh | grep 'CHIP ID: 0' | cut -c21-22
    # ['14', '15', '16', '17']

    # Ignore the master core ID (first entry in the list).
    ${non_master_core_ids}=  Get Slice From List  ${core_ids}  1
    # Fetch a random non-master core ID (e.g. '9').
    ${core_ids_sub_list}=  Evaluate
    ...  random.sample(${non_master_core_ids}, 1)  random
    ${core_id}=  Get From List  ${core_ids_sub_list}  0
    ${translated_fir_addr}=  FIR Address Translation Through HOST
    ...  ${fir}  ${core_id}  ${target_type}

    [Return]  ${translated_fir_addr}

RAS Test SetUp
    [Documentation]  Validate input parameters and boot the host to OS.

    Should Not Be Empty
    ...  ${OS_HOST}  msg=You must provide DNS name/IP of the OS host.
    Should Not Be Empty
    ...  ${OS_USERNAME}  msg=You must provide OS host user name.
    Should Not Be Empty
    ...  ${OS_PASSWORD}  msg=You must provide OS host user password.

    # Boot to OS.
    REST Power On
    # Adding delay after host bring up.
    Sleep  60s

RAS Suite Setup
    [Documentation]  Create RAS log directory to store all RAS test logs.

    ${RAS_LOG_DIR_PATH}=  Catenate  ${EXECDIR}/RAS_logs/
    Set Suite Variable  ${RAS_LOG_DIR_PATH}
    Set Suite Variable  ${master_proc_chip}  False

    Create Directory  ${RAS_LOG_DIR_PATH}
    OperatingSystem.Directory Should Exist  ${RAS_LOG_DIR_PATH}
    Empty Directory  ${RAS_LOG_DIR_PATH}

    Should Not Be Empty  ${ESEL_BIN_PATH}
    Set Environment Variable  PATH  %{PATH}:${ESEL_BIN_PATH}

    # Boot to OS.
    REST Power On

    # Check Opal-PRD service enabled on host.
    ${opal_prd_state}=  Is Opal-PRD Service Enabled
    Run Keyword If  '${opal_prd_state}' == 'disabled'
    ...  Enable Opal-PRD Service On HOST

RAS Suite Cleanup
    [Documentation]  Perform RAS suite cleanup and verify that host
    ...              boots after test suite run.

    # Boot to OS.
    REST Power On
    Delete Error Logs
    Gard Operations On OS  clear all


Inject Error At HOST Boot Path
    [Documentation]  Inject and verify recoverable error on processor through
    ...              BMC using pdbg tool at HOST Boot path.
    ...              Test sequence:
    ...              1. Inject error on a given target
    ...                 (e.g. Processor core, CAPP, MCA) through BMC using
    ...                 pdbg tool at HOST Boot path.
    ...              2. Check if HOST is rebooted and running.
    ...              3. Verify error log entry & signature description.
    ...              4. Verify & clear gard records.
    [Arguments]  ${fir}  ${chip_address}  ${signature_desc}  ${log_prefix}
    # Description of argument(s):
    # fir             FIR (Fault isolation register) value (e.g. 2011400).
    # chip_address    Chip address (e.g. 2000000000000000).
    # signature_desc  Error log signature description.
    # log_prefix      Log path prefix.

    Inject Error Through BMC At HOST Boot  ${fir}  ${chip_address}

    Wait Until Keyword Succeeds  500 sec  20 sec  Is Host Rebooted
    Wait for OS
    Verify Error Log Entry  ${signature_desc}  ${log_prefix}
    Verify And Clear Gard Records On HOST
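
# Usage sketch (an assumption, not part of the original file): a test suite
# could import this resource and drive the keywords above roughly as shown.
# The resource path, the test case name, and ${signature_desc} are
# hypothetical; the FIR, chip address, and threshold values are the examples
# given in the keyword argument descriptions. The sketch is kept as comments
# because a resource file cannot contain a test case section.
#
# *** Settings ***
# Resource          ras_utils.robot    # hypothetical path to this file
# Suite Setup       RAS Suite Setup
# Test Setup        RAS Test SetUp
# Suite Teardown    RAS Suite Cleanup
#
# *** Test Cases ***
# Verify Recoverable Error Through Host
#     Inject Recoverable Error With Threshold Limit  HOST  2011400
#     ...  2000000000000000  1  ${signature_desc}  ${RAS_LOG_DIR_PATH}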