xref: /openbmc/docs/designs/cper-records.md (revision f167b6d2)
# CPER records - CPER

Author: Ed Tanous - edtanous

Created: 5-22-2024

## Problem Description

Server CPUs expose a manageability interface referred to as CPER records. A
user outside of the BMC would like to read these records in a decoded state,
rather than as a raw package.

## Background and References

CPER stands for Common Platform Error Record and is defined as an industry
standard in the [UEFI Specification][uefi_spec], with CPU ISA-specific
definitions for aarch64, x64, Itanium, and IA32. Within this document there are
several architecture definitions for BMC, including section C.2.2 RAS IPMI
Message Format, and IPMI-based RAS Event Receiver.

In Redfish specification release 2021.3, Redfish added support for CPER records
to the [LogEntry resource][logentry]. These expose a CPER instance decoded
section by section. In addition to Redfish, there is a proposed DMTF interface
for sending CPER log events to the BMC using [MCTP/PLDM][cperevent], which is
proposed to be added in a future version of DMTF [DSP0248].

ARM has developed a reference library for decoding CPER records that does not
have a contribution mechanism, releases, or maintenance, and they have made
[clear][cper_examples] that they would like OpenBMC to be the long-term
custodian of this library.

This library hosts the meson-dev branch, which was added for the purpose of this
design and currently passes the OpenBMC CI tests. This is the branch that is
proposed to be pushed to openbmc/libcper, if approved.

## Requirements

- A BMC should be able to decode binary CPER records originating from a
  CPER-compatible CPU.

- A BMC should be able to receive and decode CPER records from a CPU per the
  [CPER specification][arm_sbmr].

- A BIOS/EDK2 build should be able to share decoding code with OpenBMC, so that
  newly added records do not require manual effort to implement in each
  codebase.

- A CPU vendor should be able to add support for CPER extensions that OpenBMC
  will then be able to decode, without impacting users of other vendors, as
  promised in the CPER specification.

- The CPER decoder should allow decoding of records from multiple CPU
  complexes.

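As a rough illustration of what "decoding a binary CPER record" involves, the
sketch below reads the leading fields of the record header described in the CPER
appendix of the UEFI specification. It is not the libcper implementation or its
API; the function name `parse_cper_header_prefix`, the dictionary keys, and the
choice of Python are assumptions made purely for illustration, and only the
first 24 bytes of the header are handled.

```python
import struct

# Leading fields of the CPER record header, little-endian and unpadded:
# signature start (4 bytes), revision (2), signature end (4),
# section count (2), error severity (4), validation bits (4),
# record length (4). Later header fields are intentionally ignored here.
_HEADER_PREFIX = struct.Struct("<4sHIHIII")

# Error severity values as listed for the record header.
SEVERITY_NAMES = {
    0: "Recoverable",
    1: "Fatal",
    2: "Corrected",
    3: "Informational",
}


def parse_cper_header_prefix(record: bytes) -> dict:
    """Decode the first 24 bytes of a raw CPER record (hypothetical helper)."""
    if len(record) < _HEADER_PREFIX.size:
        raise ValueError("buffer too short for a CPER record header")

    (signature, revision, signature_end, section_count,
     severity, validation_bits, record_length) = _HEADER_PREFIX.unpack_from(record)

    # A record starts with the ASCII signature "CPER" and carries
    # 0xFFFFFFFF in the signature end field.
    if signature != b"CPER" or signature_end != 0xFFFFFFFF:
        raise ValueError("not a CPER record")

    return {
        "revision": revision,
        "section_count": section_count,
        # Unknown severity codes are reported numerically rather than guessed.
        "severity": SEVERITY_NAMES.get(severity, f"Unknown({severity})"),
        "validation_bits": validation_bits,
        "record_length": record_length,
    }
```

A caller would pass the raw bytes received from the CPU and then walk the
section descriptors that follow the header; that step, and any vendor-specific
section decoding, is left out of this sketch.
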
## Proposed Design

While this design fits into a much more elaborate design alluded to in the
aforementioned ARM document, this document only requests the first step:
creating a shared library within the OpenBMC organization that can be built
upon over time, even though it might not be a complete implementation at this
time. It is expected that the ubiquity of CPER records in the BMC ecosystem
justifies the creation of the repository. Even if the initial implementation
does not meet all design goals for all contributors, having a common
contribution model, CI testing, and license is beneficial as a whole.

Future design docs (or amendments to this design) will iterate on implementing
more of the design referenced in the [CPER specification][arm_sbmr] for common
ARM platforms. The initial goal of this design is to transfer custody of the
libcper repo and bring its quality up to standards.

## Alternatives Considered

Rewrite libcper decoding from a new design point. While this is certainly
possible given the small size of the libcper repo as it exists today, it would
bifurcate the already existing implementations of the decoder.

## Impacts

A new repo will be created within the organization. A new recipe will be added
to OpenBMC.

### Organizational

- Does this design require a new repository? Yes
- Who will be the initial maintainer(s) of this repository? Ed Tanous
- Which repositories are expected to be modified to execute this design?

  - openbmc/openbmc
  - openbmc/libcper

#### Potentially in the future

- openbmc/phosphor-debug-collector
- openbmc/bmcweb
- openbmc/phosphor-logging

## Testing

Unit tests are already present in the repo to verify basic functionality.
CPU-model-specific error generators will be used to simulate the full path once
the design is complete.

[arm_sbmr]: https://developer.arm.com/documentation/den0069/latest/
[uefi_spec]: https://uefi.org/specifications
[logentry]:
  https://github.com/DMTF/Redfish-Publications/blob/5b217908b5378b24e4f390c063427d7a707cd308/csdl/LogEntry_v1.xml#L1403
[cperevent]:
  https://www.dmtf.org/sites/default/files/PMCI_CPEREvent_Proposal_v3.pdf
[DSP0248]: https://www.dmtf.org/dsp/DSP0248
[cper_examples]:
  https://gitlab.arm.com/server_management/libcper/-/blob/b8b687c2e05846afd37b60222a0b4253acda81fd/README.md#usage-examples
