*** Settings ***
Documentation    Stress the system using HTX exerciser.
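# Example invocation, given as a hedged sketch rather than a documented command.
# The HTX_* names are the variables this suite reads. The values and the suite
# path are placeholders (assumptions), and the resource file may require further
# variables (for example, host credentials) that are not shown here.
# HTX_DURATION and HTX_INTERVAL are assumed to be Robot Framework time strings,
# because they are passed to "Repeat Keyword" and "Sleep" respectively.
#
#   robot -v HTX_DURATION:"2 hours" -v HTX_INTERVAL:"15 min" -v HTX_LOOP:1 \
#         -v HTX_MDT_PROFILE:mdt.bu -v HTX_KEEP_RUNNING:0 <path/to/this/suite>.robot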

Resource         ../syslib/utils_os.robot

Test Setup       Pre Test Case Execution
Test Teardown    Post Test Case Execution

*** Variables ***

${stack_mode}        skip

*** Test Cases ***

GPU Stress Test
    [Documentation]  Stress the GPU using HTX exerciser.
    [Tags]  GPU_Stress_Test

    Rprintn
    Rpvars  HTX_DURATION  HTX_INTERVAL

    Repeat Keyword  ${HTX_LOOP} times  Execute GPU Test


*** Keywords ***

Execute GPU Test
    [Documentation]  Run one HTX exerciser cycle on the GPU and check for errors.
    # Test Flow:
    #              - Power on
    #              - Establish SSH connection session
    #              - Collect GPU nvidia status output
    #              - Create HTX mdt profile
    #              - Run GPU specific HTX exerciser
    #              - Check HTX status for errors

    # Collect data before the test starts.
    Collect NVIDIA Log File  start

    # Create the default profile only when the default 'mdt.bu' is requested.
    Run Keyword If  '${HTX_MDT_PROFILE}' == 'mdt.bu'
    ...  Create Default MDT Profile

    Run MDT Profile

    Loop HTX Health Check

    # After the health check loop, check the OS dmesg log for errors.
    Check For Errors On OS Dmesg Log

    Shutdown HTX Exerciser

    Rprint Timen  HTX Test ran for: ${HTX_DURATION}


Loop HTX Health Check
    [Documentation]  Check the HTX run status repeatedly for HTX_DURATION,
    ...  sleeping HTX_INTERVAL between checks. Stops early if a check fails.

    Repeat Keyword  ${HTX_DURATION}
    ...  Run Keywords  Check HTX Run Status
    ...  AND  Sleep  ${HTX_INTERVAL}


Post Test Case Execution
    [Documentation]  Do the post-test teardown.
    # 1. Shut down the HTX exerciser if the test failed.
    # 2. Capture FFDC on test failure.
    # 3. Close all open SSH connections.

    # Keep HTX running if the user set HTX_KEEP_RUNNING to 1.
    Run Keyword If  '${TEST_STATUS}' == 'FAIL' and ${HTX_KEEP_RUNNING} == ${0}
    ...  Shutdown HTX Exerciser

    # Collect nvidia-smi output data on exit.
    Collect NVIDIA Log File  end

    FFDC On Test Case Fail
    Close All Connections