ACPI ERST DEVICE
================

The ACPI ERST device is utilized to support the ACPI Error Record
Serialization Table, ERST, functionality. This feature is designed for
storing error records in persistent storage for future reference
and/or debugging.

The ACPI specification[1], in Chapter "ACPI Platform Error Interfaces
(APEI)", and specifically subsection "Error Serialization", outlines a
method for storing error records into persistent storage.

The format of error records is described in the UEFI specification[2],
in Appendix N "Common Platform Error Record".

While the ACPI specification allows for an NVRAM "mode" (see
GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES) where non-volatile RAM is
exposed for direct access by the OS/guest, this device implements the
non-NVRAM "mode". This non-NVRAM "mode" is what is implemented by most
BIOS (since flash memory requires programming operations in order to
update its contents). Furthermore, as of the time of this writing,
Linux only supports the non-NVRAM "mode".


Background/Motivation
---------------------

Linux uses the persistent storage filesystem, pstore, to record
information (e.g. dmesg tail) upon panics and shutdowns. Pstore is
independent of, and runs before, kdump.
In certain scenarios (i.e. hosts/guests with root filesystems on
NFS/iSCSI where networking software and/or hardware fails, and thus
kdump fails), pstore may contain information available for
post-mortem debugging.

Two common storage backends for the pstore filesystem are ACPI ERST
and UEFI. Most BIOS implement ACPI ERST. UEFI is not utilized in all
guests. With QEMU supporting ACPI ERST, it becomes a viable pstore
storage backend for virtual machines (as it is now for bare metal
machines).

Enabling support for ACPI ERST facilitates a consistent method to
capture kernel panic information in a wide range of guests: from
resource-constrained microvms to very large guests, and in particular,
in direct-boot environments (which would lack UEFI run-time services).

Note that Microsoft Windows also utilizes the ACPI ERST for certain
crash information, if available[3].


Configuration/Usage
-------------------

To use ACPI ERST, a memory-backend-file object and acpi-erst device
can be created, for example::

    qemu ... \
    -object memory-backend-file,id=erstnvram,mem-path=acpi-erst.backing,size=0x10000,share=on \
    -device acpi-erst,memdev=erstnvram

For proper operation, the ACPI ERST device needs a memory-backend-file
object with the following parameters:

 - id: The id of the memory-backend-file object is used to associate
   this memory with the acpi-erst device.
 - size: The size of the ACPI ERST backing storage. This parameter is
   required.
 - mem-path: The location of the ACPI ERST backing storage file. This
   parameter is also required.
 - share: The share=on parameter is required so that updates to the
   ERST backing store are written to the file.

The acpi-erst device itself takes the following parameters:

 - memdev: The object id of the memory-backend-file.
 - record_size: Specifies the size of the records (or slots) in the
   backend storage. Must be a power of two value greater than or
   equal to 4096 (PAGE_SIZE).


PCI Interface
-------------

The ERST device is a PCI device with two BARs, one for accessing the
programming registers, and the other for accessing the record exchange
buffer.

BAR0 contains the programming interface consisting of ACTION and VALUE
64-bit registers.
All ERST actions/operations/side effects happen on the write to the
ACTION register, by design. Any data needed by the action must be
placed into VALUE prior to writing ACTION. Reading VALUE simply
returns the register contents, which can be updated by a previous
ACTION.

BAR1 contains the 8KiB record exchange buffer, which is the
implemented maximum record size.


Backend Storage Format
----------------------

The backend storage is divided into fixed size "slots", 8KiB in
length, with each slot storing a single record. Not all slots need to
be occupied, and they need not be occupied in a contiguous fashion.
The ability to clear/erase specific records allows for the formation
of unoccupied slots.

Slot 0 contains a backend storage header that identifies the contents
as ERST and also facilitates efficient access to the records.
Depending upon the size of the backend storage, additional slots will
be designated to be a part of the slot 0 header. For example, at 8KiB,
the slot 0 header can accommodate 1021 record_id entries (8KiB less the
0x18 bytes of fixed header fields, divided by 8 bytes per entry). Thus
a storage size of 8MiB (8KiB * 1024) requires an additional slot for
use by the header. In this scenario, slot 0 and slot 1 form the backend
storage header, and records can be stored starting at slot 2.
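
The header-slot arithmetic above can be sketched as follows. This is a
minimal illustration rather than the device's implementation: it assumes
the header holds 0x18 bytes of fixed fields followed by one 64-bit
record_id per slot of the backend storage (header slots included), and
the function names are hypothetical.

```python
HEADER_FIXED_BYTES = 0x18  # bytes that precede record_id[0] in the header
RECORD_ID_BYTES = 8        # each record_id is a 64-bit value


def ids_per_single_slot(record_size=0x2000):
    """How many record_id entries fit when the header is exactly one slot."""
    return (record_size - HEADER_FIXED_BYTES) // RECORD_ID_BYTES


def header_slots(storage_size, record_size=0x2000):
    """Number of leading slots consumed by the storage header (assumed layout)."""
    total_slots = storage_size // record_size
    header_bytes = HEADER_FIXED_BYTES + RECORD_ID_BYTES * total_slots
    return -(-header_bytes // record_size)  # ceiling division


print(ids_per_single_slot())            # 1021
print(header_slots(0x10000))            # 1 (the 64KiB example from the usage section)
print(header_slots(8 * 1024 * 1024))    # 2 (8MiB needs slot 0 and slot 1)
```

With the 64KiB backing file from the usage example, a single header
slot suffices and records start at slot 1.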

Below is an example layout of the backend storage format (for storage
size less than 8MiB). The size of the storage is a multiple of 8KiB,
and contains N number of slots to store records. The example below
shows two records (in CPER format) in the backend storage, while the
remaining slots are empty/available.

::

 Slot   Record
        <------------------ 8KiB -------------------->
        +--------------------------------------------+
    0   | storage header                             |
        +--------------------------------------------+
    1   | empty/available                            |
        +--------------------------------------------+
    2   | CPER                                       |
        +--------------------------------------------+
    3   | CPER                                       |
        +--------------------------------------------+
  ...   |                                            |
        +--------------------------------------------+
    N   | empty/available                            |
        +--------------------------------------------+

The storage header consists of some basic information and an array
of CPER record_id's to efficiently access records in the backend
storage.

All fields in the header are stored in little endian format.
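
As a small illustration of the little endian convention, the sketch
below builds a synthetic header in memory and reads it back with
Python's struct module. It relies only on values this document gives
explicitly (the 'magic' value, record_id[0] at offset 0x0018, and the
rule that a record_id of all 0s or all 1s marks an available slot).
The record ids and the helper name are hypothetical, and the 32-bit
and 16-bit header fields are deliberately left untouched.

```python
import struct

ERST_MAGIC = 0x524F545354535245        # 'magic' value given in this document
RECORD_ID_BASE = 0x18                  # offset of record_id[0] in the header
INVALID_IDS = (0, 0xFFFFFFFFFFFFFFFF)  # all 0s or all 1s: slot is available


def occupied_slots(header, nslots):
    """Return {slot index: record_id} for slots holding a valid record."""
    (magic,) = struct.unpack_from("<Q", header, 0)
    if magic != ERST_MAGIC:
        raise ValueError("not an ERST backing store")
    ids = struct.unpack_from("<%dQ" % nslots, header, RECORD_ID_BASE)
    return {slot: rid for slot, rid in enumerate(ids) if rid not in INVALID_IDS}


# Synthetic 8KiB header for an 8-slot store; the record ids are made up.
header = bytearray(0x2000)
struct.pack_into("<Q", header, 0, ERST_MAGIC)
struct.pack_into("<Q", header, RECORD_ID_BASE + 8 * 2, 0x1111)
struct.pack_into("<Q", header, RECORD_ID_BASE + 8 * 3, 0x2222)

print(bytes(header[:8]))                  # b'ERSTSTOR'
print(occupied_slots(bytes(header), 8))   # {2: 4369, 3: 8738}
```

Stored in little endian order, the magic value reads as the ASCII
string "ERSTSTOR" at the start of the backing file.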

::

  +--------------------------------------------+
  | magic                                      | 0x0000
  +--------------------------------------------+
  | record_offset       | record_size          | 0x0008
  +--------------------------------------------+
  | record_count        | reserved | version   | 0x0010
  +--------------------------------------------+
  | record_id[0]                               | 0x0018
  +--------------------------------------------+
  | record_id[1]                               | 0x0020
  +--------------------------------------------+
  | record_id[...]                             |
  +--------------------------------------------+
  | record_id[N]                               | 0x1FF8
  +--------------------------------------------+

The 'magic' field contains the value 0x524F545354535245.

The 'record_size' field contains the value 0x2000, 8KiB.

The 'record_offset' field points to the first record_id in the array,
0x0018.

The 'version' field contains 0x0100, the first version.

The 'record_count' field contains the number of valid records in the
backend storage.

The 'record_id' array fields are the 64-bit record identifiers of the
CPER record in the corresponding slot.
Stated differently, the location of a CPER record_id in the
record_id[] array provides the slot index for the corresponding record
in the backend storage.

Note that, for example, with a backend storage less than 8MiB, slot 0
contains the header, so record_id[0] will never contain a valid
CPER record_id. Instead slot 1 is the first available slot and thus
record_id[1] may contain a CPER record_id.

A 'record_id' of all 0s or all 1s indicates an invalid record (i.e. the
slot is available).


References
----------

[1] "Advanced Configuration and Power Interface Specification",
    version 4.0, June 2009.

[2] "Unified Extensible Firmware Interface Specification",
    version 2.1, October 2008.

[3] "Windows Hardware Error Architecture", specifically
    "Error Record Persistence Mechanism".