ACPI ERST DEVICE
================

The ACPI ERST device supports the ACPI Error Record Serialization
Table (ERST) functionality. This feature is designed for storing
error records in persistent storage for future reference and/or
debugging.

The ACPI specification[1], in Chapter "ACPI Platform Error Interfaces
(APEI)", and specifically subsection "Error Serialization", outlines a
method for storing error records into persistent storage.

The format of error records is described in the UEFI specification[2],
in Appendix N "Common Platform Error Record".

While the ACPI specification allows for an NVRAM "mode" (see
GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES) where non-volatile RAM is
exposed for direct access by the OS/guest, this device implements the
non-NVRAM "mode". This non-NVRAM "mode" is what most BIOS
implementations provide (since flash memory requires programming
operations in order to update its contents). Furthermore, as of the
time of this writing, Linux only supports the non-NVRAM "mode".


Background/Motivation
---------------------

Linux uses the persistent storage filesystem, pstore, to record
information (e.g. dmesg tail) upon panics and shutdowns.  Pstore is
independent of, and runs before, kdump.  In certain scenarios (e.g.
hosts/guests with root filesystems on NFS/iSCSI where networking
software and/or hardware fails, and thus kdump fails), pstore may
contain information available for post-mortem debugging.

Two common storage backends for the pstore filesystem are ACPI ERST
and UEFI. Most BIOS implementations support ACPI ERST, whereas UEFI is
not used in all guests. With QEMU supporting ACPI ERST, it becomes a
viable pstore storage backend for virtual machines (as it is now for
bare metal machines).

Enabling support for ACPI ERST facilitates a consistent method to
capture kernel panic information in a wide range of guests: from
resource-constrained microvms to very large guests, and in particular,
in direct-boot environments (which would lack UEFI run-time services).

Note that Microsoft Windows also utilizes the ACPI ERST for certain
crash information, if available[3].


Configuration/Usage
-------------------

To use ACPI ERST, create a memory-backend-file object and an acpi-erst
device, for example:

::

 qemu ... \
 -object memory-backend-file,id=erstnvram,mem-path=acpi-erst.backing,size=0x10000,share=on \
 -device acpi-erst,memdev=erstnvram

For proper operation, the ACPI ERST device needs a memory-backend-file
object with the following parameters:

 - id: The id of the memory-backend-file object is used to associate
   this memory with the acpi-erst device.
 - size: The size of the ACPI ERST backing storage. This parameter is
   required.
 - mem-path: The location of the ACPI ERST backing storage file. This
   parameter is also required.
 - share: The share=on parameter is required so that updates to the
   ERST backing store are written to the file.

The ERST device itself takes the following parameters:

 - memdev: The object id of the memory-backend-file.
 - record_size: Specifies the size of the records (or slots) in the
   backend storage. Must be a power of two value greater than or
   equal to 4096 (PAGE_SIZE).
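
As a rough illustration of the parameters listed above, the Python
sketch below assembles the two QEMU options and checks the record_size
constraint before a guest is launched. The helper names
(``is_valid_record_size``, ``erst_qemu_args``) and the default object id
are assumptions made for this example; only the option names come from
the text above.

```python
def is_valid_record_size(record_size: int) -> bool:
    """True if record_size is a power of two and at least 4096."""
    return record_size >= 4096 and (record_size & (record_size - 1)) == 0


def erst_qemu_args(mem_path, size, record_size=None, obj_id="erstnvram"):
    """Build the -object/-device arguments for an acpi-erst device.

    Hypothetical helper: it only strings together the options that the
    documentation describes and does not launch QEMU.
    """
    if record_size is not None and not is_valid_record_size(record_size):
        raise ValueError("record_size must be a power of two >= 4096")
    backend = (f"memory-backend-file,id={obj_id},mem-path={mem_path},"
               f"size={size:#x},share=on")
    device = f"acpi-erst,memdev={obj_id}"
    if record_size is not None:
        device += f",record_size={record_size}"
    return ["-object", backend, "-device", device]


if __name__ == "__main__":
    print(" ".join(erst_qemu_args("acpi-erst.backing", 0x10000)))
```

Rejecting a bad record_size before QEMU starts gives an earlier and more
specific error than a failed device realization would.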


PCI Interface
-------------

The ERST device is a PCI device with two BARs, one for accessing the
programming registers, and the other for accessing the record exchange
buffer.

BAR0 contains the programming interface, consisting of the 64-bit
ACTION and VALUE registers.  All ERST actions/operations/side effects
happen on the write to ACTION, by design. Any data needed by the
action must be placed into VALUE prior to writing ACTION.  Reading
VALUE simply returns the register contents, which may have been
updated by a previous ACTION.
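
The write-VALUE-then-write-ACTION ordering described above can be
illustrated with a toy register model in Python. This is only a sketch:
the class, the two action codes and their behaviour are invented for
illustration and do not correspond to the device's real action set or
register offsets.

```python
class ToyErstRegisters:
    """Toy model of an ACTION/VALUE register pair (not the real device)."""

    # Hypothetical action codes, chosen only for this example.
    ACTION_STORE = 0x1   # latch VALUE into internal state
    ACTION_FETCH = 0x2   # copy internal state back into VALUE

    def __init__(self):
        self.value = 0       # 64-bit VALUE register
        self._latched = 0    # internal device state

    def write_value(self, data):
        # Writing VALUE has no side effect; it only stages data.
        self.value = data & 0xFFFFFFFFFFFFFFFF

    def read_value(self):
        # Reading VALUE simply returns the register contents.
        return self.value

    def write_action(self, action):
        # All side effects happen here, on the write to ACTION.
        if action == self.ACTION_STORE:
            self._latched = self.value
        elif action == self.ACTION_FETCH:
            self.value = self._latched
        else:
            raise ValueError(f"unknown action {action:#x}")


if __name__ == "__main__":
    regs = ToyErstRegisters()
    regs.write_value(0x1234)                # 1. stage the operand
    regs.write_action(regs.ACTION_STORE)    # 2. trigger the action
    regs.write_value(0)                     # staging alone changes nothing else
    regs.write_action(regs.ACTION_FETCH)    # the action updates VALUE
    print(hex(regs.read_value()))
```

Keeping every side effect on the ACTION write means a driver can stage
or re-read VALUE freely without triggering device work.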

BAR1 contains the record exchange buffer, which is 8KiB in size, the
maximum record size implemented.


Backend Storage Format
----------------------

The backend storage is divided into fixed size "slots", 8KiB in
length, with each slot storing a single record.  Not all slots need to
be occupied, and they need not be occupied in a contiguous fashion.
The ability to clear/erase specific records allows for the formation
of unoccupied slots.

Slot 0 contains a backend storage header that identifies the contents
as ERST and also facilitates efficient access to the records.
Depending upon the size of the backend storage, additional slots will
be designated to be a part of the slot 0 header. For example, at 8KiB,
the slot 0 header can accommodate 1021 records (the 8192-byte slot,
less the 24 bytes of fixed header fields, leaves room for 1021 8-byte
record_ids). Thus a storage size of 8MiB (8KiB * 1024) requires an
additional slot for use by the header. In this scenario, slot 0 and
slot 1 form the backend storage header, and records can be stored
starting at slot 2.
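
The header arithmetic above can be checked with a few lines of Python.
The sketch assumes, following the header layout given later in this
document, a 0x18-byte fixed header followed by one 8-byte record_id per
slot; the function names are made up for this example.

```python
HEADER_FIXED_BYTES = 0x18   # fixed fields that precede record_id[0]
RECORD_ID_BYTES = 8         # each record_id is 64 bits


def ids_per_header_slot(record_size=8192):
    """record_ids that fit in slot 0 after the fixed header fields."""
    return (record_size - HEADER_FIXED_BYTES) // RECORD_ID_BYTES


def header_slots(storage_size, record_size=8192):
    """Slots consumed by the header for a given backend storage size."""
    total_slots = storage_size // record_size
    header_bytes = HEADER_FIXED_BYTES + total_slots * RECORD_ID_BYTES
    return -(-header_bytes // record_size)   # ceiling division


if __name__ == "__main__":
    print(ids_per_header_slot())            # 1021
    print(header_slots(64 * 1024))          # 1
    print(header_slots(8 * 1024 * 1024))    # 2
```

With 1024 slots the record_id array alone fills 8192 bytes, so the fixed
fields push the header into a second slot, matching the 8MiB example.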

Below is an example layout of the backend storage format (for storage
size less than 8MiB). The size of the storage is a multiple of 8KiB,
and contains N slots to store records. The example below shows two
records (in CPER format) in the backend storage, while the remaining
slots are empty/available.

::

 Slot   Record
        <------------------ 8KiB -------------------->
        +--------------------------------------------+
    0   | storage header                             |
        +--------------------------------------------+
    1   | empty/available                            |
        +--------------------------------------------+
    2   | CPER                                       |
        +--------------------------------------------+
    3   | CPER                                       |
        +--------------------------------------------+
  ...   |                                            |
        +--------------------------------------------+
    N   | empty/available                            |
        +--------------------------------------------+

The storage header consists of some basic information and an array
of CPER record_id's to efficiently access records in the backend
storage.

All fields in the header are stored in little-endian format.

::

  +--------------------------------------------+
  | magic                                      | 0x0000
  +--------------------------------------------+
  | record_offset        | record_size         | 0x0008
  +--------------------------------------------+
  | record_count         | reserved | version  | 0x0010
  +--------------------------------------------+
  | record_id[0]                               | 0x0018
  +--------------------------------------------+
  | record_id[1]                               | 0x0020
  +--------------------------------------------+
  | record_id[...]                             |
  +--------------------------------------------+
  | record_id[N]                               | 0x1FF8
  +--------------------------------------------+

The 'magic' field contains the value 0x524F545354535245.
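
Stored as a little-endian 64-bit integer, the magic value corresponds
to an eight-character ASCII tag, which the short Python check below
makes visible. The variable names are chosen for this example only.

```python
MAGIC = 0x524F545354535245

# Serialize the value the way the header stores it: 8 bytes, little-endian.
magic_bytes = MAGIC.to_bytes(8, "little")

if __name__ == "__main__":
    print(magic_bytes.decode("ascii"))   # ERSTSTOR
```

A hex dump of the first eight bytes of a backing file is therefore a
quick way to recognize an initialized ERST store.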

The 'record_size' field contains the value 0x2000, 8KiB.

The 'record_offset' field points to the first record_id in the array,
0x0018.

The 'version' field contains 0x0100, the first version.

The 'record_count' field contains the number of valid records in the
backend storage.

The 'record_id' array fields are the 64-bit record identifiers of the
CPER record in the corresponding slot. Stated differently, the
location of a CPER record_id in the record_id[] array provides the
slot index for the corresponding record in the backend storage.

Note that, for example, with a backend storage less than 8MiB, slot 0
contains the header, so record_id[0] will never contain a valid CPER
record_id. Instead slot 1 is the first available slot and thus
record_id[1] may contain a valid CPER record_id.

A 'record_id' of all 0s or all 1s indicates an invalid record (i.e.
the slot is available).
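
To make the header fields concrete, here is a minimal Python sketch
that unpacks the header and lists the occupied slots. It assumes each
row of the diagram is drawn with its lowest-addressed field on the
right (record_size at 0x08, record_offset at 0x0C, version at 0x10,
reserved at 0x12, record_count at 0x14); that byte order is an
interpretation of the diagram rather than something the text spells
out, and the function names are hypothetical.

```python
import struct

MAGIC = 0x524F545354535245
INVALID_IDS = (0, 0xFFFFFFFFFFFFFFFF)   # all 0s or all 1s
# Assumed little-endian field order: magic, record_size, record_offset,
# version, reserved, record_count (24 bytes in total).
HEADER = struct.Struct("<QIIHHI")


def parse_header(buf):
    """Unpack the fixed header fields and verify the magic value."""
    magic, rsize, roff, version, _rsvd, count = HEADER.unpack_from(buf, 0)
    if magic != MAGIC:
        raise ValueError("not an ERST backing store")
    return {"record_size": rsize, "record_offset": roff,
            "version": version, "record_count": count}


def occupied_slots(buf, total_slots):
    """Map slot index -> record_id for every valid record_id entry."""
    hdr = parse_header(buf)
    ids = struct.unpack_from(f"<{total_slots}Q", buf, hdr["record_offset"])
    return {i: rid for i, rid in enumerate(ids) if rid not in INVALID_IDS}


if __name__ == "__main__":
    # Synthetic header for 4 slots with records in slots 2 and 3.
    ids = [0, 0, 0xAAAA, 0xBBBB]
    buf = HEADER.pack(MAGIC, 0x2000, 0x18, 0x0100, 0, 2)
    buf += struct.pack("<4Q", *ids)
    print(occupied_slots(buf, 4))
```

Because the array position is the slot index, the byte offset of a
record in the backing file is simply the slot index multiplied by
record_size.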


References
----------

[1] "Advanced Configuration and Power Interface Specification",
    version 4.0, June 2009.

[2] "Unified Extensible Firmware Interface Specification",
    version 2.1, October 2008.

[3] "Windows Hardware Error Architecture", specifically
    "Error Record Persistence Mechanism".