ACPI ERST DEVICE
================

The ACPI ERST device is utilized to support the ACPI Error Record
Serialization Table, ERST, functionality. This feature is designed for
storing error records in persistent storage for future reference
and/or debugging.

The ACPI specification[1], in Chapter "ACPI Platform Error Interfaces
(APEI)", and specifically subsection "Error Serialization", outlines a
method for storing error records into persistent storage.

The format of error records is described in the UEFI specification[2],
in Appendix N "Common Platform Error Record".

While the ACPI specification allows for an NVRAM "mode" (see
GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES) where non-volatile RAM is
exposed for direct access by the OS/guest, this device implements the
non-NVRAM "mode". This non-NVRAM "mode" is what is implemented by most
BIOS (since flash memory requires programming operations in order to
update its contents). Furthermore, as of the time of this writing,
Linux only supports the non-NVRAM "mode".


Background/Motivation
---------------------

Linux uses the persistent storage filesystem, pstore, to record
information (e.g. dmesg tail) upon panics and shutdowns. Pstore is
independent of, and runs before, kdump. In certain scenarios (e.g.
hosts/guests with root filesystems on NFS/iSCSI where networking
software and/or hardware fails, and thus kdump fails), pstore may
contain information available for post-mortem debugging.

Two common storage backends for the pstore filesystem are ACPI ERST
and UEFI. Most BIOS implement ACPI ERST. UEFI is not utilized in all
guests. With QEMU supporting ACPI ERST, it becomes a viable pstore
storage backend for virtual machines (as it is now for bare metal
machines).


Enabling support for ACPI ERST facilitates a consistent method to
capture kernel panic information in a wide range of guests: from
resource-constrained microvms to very large guests, and in particular,
in direct-boot environments (which would lack UEFI run-time services).

Note that Microsoft Windows also utilizes the ACPI ERST for certain
crash information, if available[3].


Configuration/Usage
-------------------

To use ACPI ERST, a memory-backend-file object and acpi-erst device
can be created, for example:

 qemu ... \
 -object memory-backend-file,id=erstnvram,mem-path=acpi-erst.backing,size=0x10000,share=on \
 -device acpi-erst,memdev=erstnvram

For proper operation, the ACPI ERST device needs a memory-backend-file
object with the following parameters:

 - id: The id of the memory-backend-file object is used to associate
   this memory with the acpi-erst device.
 - size: The size of the ACPI ERST backing storage. This parameter is
   required.
 - mem-path: The location of the ACPI ERST backing storage file. This
   parameter is also required.
 - share: The share=on parameter is required so that updates to the
   ERST backing store are written to the file.

and an acpi-erst device with the following parameters:

 - memdev: The object id of the memory-backend-file.
 - record_size: Specifies the size of the records (or slots) in the
   backend storage. Must be a power of two value greater than or
   equal to 4096 (PAGE_SIZE).


PCI Interface
-------------

The ERST device is a PCI device with two BARs, one for accessing the
programming registers, and the other for accessing the record exchange
buffer.

BAR0 contains the programming interface consisting of ACTION and VALUE
64-bit registers. All ERST actions/operations/side effects happen on
the write to the ACTION, by design. Any data needed by the action must
be placed into VALUE prior to writing ACTION.
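
The ordering rule for ACTION and VALUE can be illustrated with a small
toy model. This is only a sketch and not the device implementation:
the class ToyErstRegisters and its methods are hypothetical, and the
behaviour of the two actions shown is assumed for illustration. The
action codes are the ones listed in the ACPI specification's table of
serialization actions.

```python
# Action codes as numbered in the ACPI specification (APEI, ERST).
SET_RECORD_IDENTIFIER = 0x9
GET_RECORD_COUNT = 0xA

MASK64 = 0xFFFFFFFFFFFFFFFF


class ToyErstRegisters:
    """Hypothetical model of the ACTION/VALUE pair (illustration only)."""

    def __init__(self, record_count=0):
        self.value = 0                  # VALUE register: plain storage
        self.record_id = 0              # internal state set by an action
        self.record_count = record_count

    def write_value(self, data):
        # No side effects: the operand just waits for the next ACTION.
        self.value = data & MASK64

    def read_value(self):
        # No side effects: returns whatever a write or action left behind.
        return self.value

    def write_action(self, action):
        # All side effects happen on this write.
        if action == SET_RECORD_IDENTIFIER:
            self.record_id = self.value       # consumes VALUE
        elif action == GET_RECORD_COUNT:
            self.value = self.record_count    # leaves its result in VALUE


regs = ToyErstRegisters(record_count=2)
regs.write_value(0x1234)                  # 1. place the operand in VALUE
regs.write_action(SET_RECORD_IDENTIFIER)  # 2. writing ACTION triggers it
regs.write_action(GET_RECORD_COUNT)       # an action that returns data
count = regs.read_value()                 # 3. result is read from VALUE
```

In this model, writing ACTION before VALUE would latch a stale operand,
which is why the operand is always written first.
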
Reading the VALUE simply returns the register contents, which can be
updated by a previous ACTION.

BAR1 contains the 8KiB record exchange buffer, which is the
implemented maximum record size.


Backend Storage Format
----------------------

The backend storage is divided into fixed size "slots", 8KiB in
length, with each slot storing a single record. Not all slots need to
be occupied, and they need not be occupied in a contiguous fashion.
The ability to clear/erase specific records allows for the formation
of unoccupied slots.

Slot 0 contains a backend storage header that identifies the contents
as ERST and also facilitates efficient access to the records.
Depending upon the size of the backend storage, additional slots will
be designated to be a part of the slot 0 header. For example, at 8KiB,
the slot 0 header can accommodate 1021 records (8192 bytes, less the
0x18 bytes of fixed header fields, divided by 8 bytes per record_id).
Thus a storage size of 8MiB (8KiB * 1024) requires an additional slot
for use by the header. In this scenario, slot 0 and slot 1 form the
backend storage header, and records can be stored starting at slot 2.

Below is an example layout of the backend storage format (for storage
size less than 8MiB). The size of the storage is a multiple of 8KiB,
and contains N number of slots to store records. The example below
shows two records (in CPER format) in the backend storage, while the
remaining slots are empty/available.

::

 Slot   Record

        <------------------ 8KiB -------------------->
        +--------------------------------------------+
    0   | storage header                             |
        +--------------------------------------------+
    1   | empty/available                            |
        +--------------------------------------------+
    2   | CPER                                       |
        +--------------------------------------------+
    3   | CPER                                       |
        +--------------------------------------------+
  ...   |                                            |
        +--------------------------------------------+
    N   | empty/available                            |
        +--------------------------------------------+

The storage header consists of some basic information and an array
of CPER record_id's to efficiently access records in the backend
storage.

All fields in the header are stored in little endian format.

::

        +--------------------------------------------+
        | magic                                      | 0x0000
        +--------------------------------------------+
        | record_offset        | record_size         | 0x0008
        +--------------------------------------------+
        | record_count         | reserved | version  | 0x0010
        +--------------------------------------------+
        | record_id[0]                               | 0x0018
        +--------------------------------------------+
        | record_id[1]                               | 0x0020
        +--------------------------------------------+
        | record_id[...]                             |
        +--------------------------------------------+
        | record_id[N]                               | 0x1FF8
        +--------------------------------------------+

The 'magic' field contains the value 0x524F545354535245.

The 'record_size' field contains the value 0x2000, 8KiB.

The 'record_offset' field points to the first record_id in the array,
0x0018.

The 'version' field contains 0x0100, the first version.

The 'record_count' field contains the number of valid records in the
backend storage.

The 'record_id' array fields are the 64-bit record identifiers of the
CPER record in the corresponding slot. Stated differently, the
location of a CPER record_id in the record_id[] array provides the
slot index for the corresponding record in the backend storage.

Note that, for example, with a backend storage less than 8MiB, slot 0
contains the header, so record_id[0] will never contain a valid
CPER record_id. Instead slot 1 is the first available slot and thus
record_id[1] may contain a valid CPER record_id.

A 'record_id' of all 0s or all 1s indicates an invalid record (i.e.
the slot is available).


References
----------

[1] "Advanced Configuration and Power Interface Specification",
    version 4.0, June 2009.

[2] "Unified Extensible Firmware Interface Specification",
    version 2.1, October 2008.

[3] "Windows Hardware Error Architecture", specifically
    "Error Record Persistence Mechanism".
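
As an illustration of the storage header described under "Backend
Storage Format", the following sketch decodes a single-slot header and
lists the occupied slots. It rests on assumptions that are not stated
in the text: the field widths (32-bit record_size, record_offset and
record_count, 16-bit version and reserved) and their order in memory
are read off the header diagram, taking the rightmost field of each
row to sit at the lowest address. The helper names are hypothetical.

```python
import struct

SLOT_SIZE = 0x2000                  # 8KiB slots, per the text
MAGIC = 0x524F545354535245          # 'magic' value, per the text
# Assumed packing (little endian): u64 magic, u32 record_size,
# u32 record_offset, u16 version, u16 reserved, u32 record_count.
HEADER = struct.Struct("<QIIHHI")   # 24 bytes, i.e. 0x18
INVALID_IDS = (0, 0xFFFFFFFFFFFFFFFF)


def ids_per_header_slot(slot_size=SLOT_SIZE):
    """Count the 64-bit record_id entries that fit after the fixed fields."""
    return (slot_size - HEADER.size) // 8


def parse_header(blob):
    """Decode a single-slot header; return occupied (slot, record_id) pairs."""
    magic, record_size, record_offset, version, _reserved, count = \
        HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("not an ERST backing store")
    # Only the first header slot is examined in this sketch.
    n = (record_size - record_offset) // 8
    ids = struct.unpack_from("<%dQ" % n, blob, record_offset)
    occupied = [(slot, rid) for slot, rid in enumerate(ids)
                if rid not in INVALID_IDS]
    return {"version": version, "record_count": count, "occupied": occupied}


# Build a synthetic header slot with records in slots 2 and 3.
blob = bytearray(SLOT_SIZE)
HEADER.pack_into(blob, 0, MAGIC, SLOT_SIZE, HEADER.size, 0x0100, 0, 2)
struct.pack_into("<Q", blob, HEADER.size + 2 * 8, 0xAAAA)
struct.pack_into("<Q", blob, HEADER.size + 3 * 8, 0xBBBB)
info = parse_header(bytes(blob))
```

Under these assumptions, ids_per_header_slot() reproduces the capacity
of 1021 records quoted for an 8KiB header slot, and the position of
each record_id in the array gives the slot index of its record.
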