| 7c410e29 | 11-Feb-2026 |
Daniel Osawa <dosawa@nvidia.com> |
nvidia-events: Hex encoding and schema alignment
Wrap eventInfo and context data under variant keys (cpu/gpu, opaque/type1-4/gpuMetadata/gpuLegacyXid/gpuRecommendedActions) to match the Redfish XML schema's wrapper pattern used by all other oneOf types in libcper.
Encode hardware identifiers and opaque fields as hex strings:
- eventHeader: type, subtype, linkId
- eventInfo.cpu: Ecid1-4, InstanceBase
- eventInfo.gpu: Pdi (0x hex), EventOriginator (raw/value dict)
- data.type1/2/3/4: key and value fields
- data.gpuMetadata: configuration, pdi (MAC-style XX:XX:...), architectureId (decomposed raw + architecture name), pciInfo fields (class/subclass/rev/vendorId/deviceId/etc)
- data.gpuRecommendedActions: flags
- data.opaque: flattened to direct hex string
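For illustration only, a minimal sketch of the two string styles named in the list above: a 0x-prefixed hex string and a MAC-style colon-separated byte string. The helper names, the fixed 64-bit width, and the most-significant-byte-first order are assumptions made for this example; they are not taken from the libcper sources.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write "0x" followed by 16 upper-case hex digits. */
int format_hex64(uint64_t value, char *out, size_t out_len)
{
	int n = snprintf(out, out_len, "0x%016" PRIX64, value);

	/* Fail if snprintf errored or the output was truncated. */
	return (n > 0 && (size_t)n < out_len) ? 0 : -1;
}

/*
 * Hypothetical helper: write the value as "XX:XX:...:XX".
 * Assumption: bytes are emitted most significant first.
 */
int format_mac_style64(uint64_t value, char *out, size_t out_len)
{
	size_t pos = 0;

	/* 8 bytes * 2 digits + 7 colons + terminating NUL = 24 */
	if (out == NULL || out_len < 24)
		return -1;

	for (int shift = 56; shift >= 0; shift -= 8) {
		unsigned int byte = (unsigned int)((value >> shift) & 0xFF);

		pos += (size_t)snprintf(out + pos, out_len - pos, "%02X", byte);
		if (shift > 0)
			out[pos++] = ':';
	}
	out[pos] = '\0';
	return 0;
}
```

Both helpers return 0 on success and -1 if the destination buffer is too small, so a caller can skip the property instead of emitting a truncated string.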
Fix GPU context JSON schemas (gpuMetadata, gpuLegacyXid, gpuRecommendedActions) to match their C struct definitions.
Change get_value_hex_16/32/64 signatures to void* to handle packed struct member addresses safely with clang.
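A minimal sketch of why a void* parameter helps here, assuming a GCC/clang-style packed attribute. The struct and the reader function below are hypothetical stand-ins, not the actual get_value_hex_* helpers: the point is only that the caller never forms a typed pointer to a possibly misaligned member, and the callee copies the bytes out with memcpy.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical packed record: 'id' sits at offset 1, so it is misaligned. */
struct __attribute__((packed)) example_record {
	uint8_t tag;
	uint32_t id;
};

/*
 * Takes const void * so callers can pass the address of a packed member
 * without creating a (possibly misaligned) uint32_t pointer. memcpy has
 * no alignment requirement on its source.
 */
uint32_t read_u32_unaligned(const void *src)
{
	uint32_t value;

	memcpy(&value, src, sizeof(value));
	return value;
}
```

A caller would write read_u32_unaligned(&rec.id); with a uint32_t* parameter the same call is the pattern that clang's address-of-packed-member diagnostic is meant to flag.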
Add GPU init and UCE ECC example CPER pairs with unit tests.
Change-Id: I30602e6a34c18bf511f5eefa5cb1dd7707b2c3b0 Signed-off-by: Daniel Osawa <dosawa@nvidia.com>
|
| 529d0817 | 06-Feb-2026 |
Daniel Osawa <dosawa@nvidia.com> |
nvidia-events: Decode Architecture field in CPU event info
Decode the 32-bit Architecture field into its component bit fields:
- hidFam (bits 3:0): Hardware ID family
- majorRev (bits 7:4): Major revision
- chipId (bits 15:8): Chip ID
- minorRev (bits 19:16): Minor revision
- preSiPlatform (bits 24:20): Silicon/PreSilicon indicator
- einjTag (bit 31): Error injection tag
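The bit layout above can be sketched as plain shift-and-mask extraction. The struct and function names here are hypothetical and chosen for this example; only the bit ranges come from the commit message.

```c
#include <stdint.h>

/* Hypothetical container for the decoded fields. */
struct arch_fields {
	uint8_t hid_fam;         /* bits 3:0   */
	uint8_t major_rev;       /* bits 7:4   */
	uint8_t chip_id;         /* bits 15:8  */
	uint8_t minor_rev;       /* bits 19:16 */
	uint8_t pre_si_platform; /* bits 24:20 */
	uint8_t einj_tag;        /* bit 31     */
};

/* Extract bits hi..lo (inclusive) of v; valid for widths below 32. */
static uint32_t extract_bits(uint32_t v, unsigned int hi, unsigned int lo)
{
	uint32_t mask = (UINT32_C(1) << (hi - lo + 1)) - 1;

	return (v >> lo) & mask;
}

struct arch_fields decode_architecture(uint32_t arch)
{
	struct arch_fields f;

	f.hid_fam = (uint8_t)extract_bits(arch, 3, 0);
	f.major_rev = (uint8_t)extract_bits(arch, 7, 4);
	f.chip_id = (uint8_t)extract_bits(arch, 15, 8);
	f.minor_rev = (uint8_t)extract_bits(arch, 19, 16);
	f.pre_si_platform = (uint8_t)extract_bits(arch, 24, 20);
	f.einj_tag = (uint8_t)extract_bits(arch, 31, 31);
	return f;
}
```

Bits 30:25 are not listed in the commit message, so this sketch simply ignores them.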
Change-Id: I7d3030bf5c9c064a3ee0fdbcac65b962ea4a12c6 Signed-off-by: Daniel Osawa <dosawa@nvidia.com>
|
| 51c18132 | 26-Nov-2025 |
Daniel Osawa <dosawa@nvidia.com> |
Add NVIDIA Event CPER section support
Add parsing and generation for NVIDIA Event error sections, including:
- CPU and GPU device-specific event info
- Multiple context data formats (key-value pairs, opaque, GPU metadata, legacy XID, recommended actions)
- JSON schema specifications
- Example files and tests
Change-Id: Ibf66e2e4263014c2157958acf2f6158361fc6866 Signed-off-by: Daniel Osawa <dosawa@nvidia.com> Signed-off-by: Ed Tanous <etanous@nvidia.com>
|
| e1cba52d | 18-Sep-2025 |
Prachotan Bathi <prachotan.bathi@arm.com> |
cper-section-arm-ras: Support Arm RAS System Architecture node CPER
The DEN0085 - Arm ACPI for the Armv8-A RAS Extension and RAS System Architecture v2.0 specification, section 4, defines additional standard CPER records for Arm RAS architecture. https://developer.arm.com/documentation/den0085/latest/
Added section definitions and a generator that produces an example CPER with one descriptor. Generate using:
    ./cper-generate --out cper.generated.dump --sections arm-ras-node
Signed-off-by: Prachotan Bathi <prachotan.bathi@arm.com> Signed-off-by: Ed Tanous <etanous@nvidia.com> Change-Id: Ic7fa68a6c584c537a3dc2c4b17795dd7ba3b3f8c
|
| 043d5f4b | 17-Oct-2025 |
Erwin Tsaur <etsaur@nvidia.com> |
ARM CPER: Decode ErrorType as bit values
ErrorInformation.ErrorType needs to be decoded as bit values instead of as an integer.
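A minimal sketch of what decoding a field as bit values rather than as an integer looks like: every set bit contributes its own name, so several error types can be reported at once. The bit positions and names in the table are assumptions for illustration and should be checked against the UEFI CPER specification; the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct error_type_bit {
	uint8_t mask;
	const char *name;
};

/* Assumed bit assignments, for illustration only. */
static const struct error_type_bit ERROR_TYPE_BITS[] = {
	{ 1u << 1, "Cache Error" },
	{ 1u << 2, "TLB Error" },
	{ 1u << 3, "Bus Error" },
	{ 1u << 4, "Micro-Architectural Error" },
};

/*
 * Write a comma-separated list of the names of all set bits into out.
 * Returns the number of names written, or -1 if out is too small.
 */
int describe_error_type(uint8_t type, char *out, size_t out_len)
{
	size_t count = sizeof(ERROR_TYPE_BITS) / sizeof(ERROR_TYPE_BITS[0]);
	size_t pos = 0;
	int found = 0;

	if (out == NULL || out_len == 0)
		return -1;
	out[0] = '\0';

	for (size_t i = 0; i < count; i++) {
		const char *sep = (found > 0) ? ", " : "";
		const char *name = ERROR_TYPE_BITS[i].name;

		if ((type & ERROR_TYPE_BITS[i].mask) == 0)
			continue;
		/* Need room for separator, name, and the terminating NUL. */
		if (strlen(sep) + strlen(name) >= out_len - pos)
			return -1;
		memcpy(out + pos, sep, strlen(sep));
		pos += strlen(sep);
		memcpy(out + pos, name, strlen(name) + 1);
		pos += strlen(name);
		found++;
	}
	return found;
}
```

With an integer decode, a value with two bits set would map to no known type; with the bitwise decode it yields both names.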
Change-Id: Iee09eb6e62561620d0903fea1ae4d6ed35898445 Signed-off-by: Erwin Tsaur <etsaur@nvidia.com>
|
| da75128c | 28-Jul-2025 |
Peter Benitez <pbenitez@nvidia.com> |
cper-section-memory: Fix validation dependency for Extended field bits
Fixed incorrect dependency between validation bits 16, 17, and 18 for the Extended field. Previously, cardSmbiosHandle (validation bit 16) and moduleSmbiosHandle (validation bit 17) were incorrectly made dependent on the Extended field validation (bit 18), but these are independent components.
Validation bit 18 controls the Extended field containing row address bits 16 and 17, while validation bits 16 and 17 control SMBIOS handle fields. These SMBIOS handle fields are independent components that should be validated separately from the Extended field's row address bits.
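The independence described above can be sketched as three separate bit tests, none of which is nested inside another. The macro, struct, and function names are hypothetical; only the bit numbers (16, 17, 18) come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define MEM_VALID_CARD_HANDLE   (UINT64_C(1) << 16) /* cardSmbiosHandle */
#define MEM_VALID_MODULE_HANDLE (UINT64_C(1) << 17) /* moduleSmbiosHandle */
#define MEM_VALID_EXTENDED      (UINT64_C(1) << 18) /* Extended: row bits 16, 17 */

/* Hypothetical summary of which fields should be emitted. */
struct mem_fields_present {
	bool card_handle;
	bool module_handle;
	bool extended;
};

struct mem_fields_present mem_fields_from_validation(uint64_t validation_bits)
{
	struct mem_fields_present p;

	/* Each field depends only on its own validation bit. */
	p.card_handle = (validation_bits & MEM_VALID_CARD_HANDLE) != 0;
	p.module_handle = (validation_bits & MEM_VALID_MODULE_HANDLE) != 0;
	p.extended = (validation_bits & MEM_VALID_EXTENDED) != 0;
	return p;
}
```

The bug described in this commit corresponds to computing card_handle and module_handle only when the Extended bit is also set.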
Tested: Added memory-validation-bits unit test
Change-Id: I9461c71bf0b782bda74ed24c95b63c080f913b19 Signed-off-by: Peter Benitez <pbenitez@nvidia.com>
|
| ad6c880f | 18-Jun-2025 |
Aushim Nagarkatti <anagarkatti@nvidia.com> |
Support to stringify CPER output
Initial commit to add a "message" property that provides a single line description of some important properties. This makes it easier to parse multiple CPERs in crowded logs.
For now, "message" is supported for nvidia, arm processor and memory types. The other types contain generic messages.
Example output:
```
"sections":[
  {
    "message":"A Corrected CCPLEXSCF NVIDIA Error occurred on CPU 0",
    "Nvidia":{
      "signature":"CCPLEXSC",

"sections":[
  {
    "message":"An ARM Processor Error occurred on CPU 0; Error Type(s): {Cache Error at Virtual Addr=0x41D6AA12D528 Physical Addr=0x80003A198DDA10}",
    "ArmProcessor":{
      "errorInfoNum":1,

"sections":[
  {
    "message":"A Multi-bit ECC Memory Error occurred at address 0x0000000080000000 at node 0",
```
Change-Id: I395d0370ec60579b8f7fede825b45a3ced8ff18f Signed-off-by: Aushim Nagarkatti <anagarkatti@nvidia.com>
|
| eda19ff0 | 10-Jun-2025 |
Aushim Nagarkatti <anagarkatti@nvidia.com> |
Rename PCIe properties
For CSDL compatibility, property names shouldn't begin with underscores or digits. Fix PCIe names.
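As a rough sketch, the rule stated above can be checked with a single first-character test. The function name is hypothetical, and this covers only the constraint mentioned in this commit message, not the full CSDL identifier rules.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if the name is non-empty and does not begin with an
 * underscore or an ASCII digit. Other CSDL naming rules are not checked.
 */
bool property_name_start_ok(const char *name)
{
	char first;

	if (name == NULL || name[0] == '\0')
		return false;
	first = name[0];
	return first != '_' && !(first >= '0' && first <= '9');
}
```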
Change-Id: I6a801e26550320f808a2cac2d91f8bd913a0eabf Signed-off-by: Aushim Nagarkatti <anagarkatti@nvidia.com>
|
| ffa7e17d | 29-May-2025 |
Ed Tanous <etanous@nvidia.com> |
Add schema for PCIe aerInfo
A few commits ago, we punted and didn't include a schema for aerInfo. This commit re-enables the JSON schema and corrects the config for the PCIe error fields.
There are certain objects that have zero properties. These are commented out temporarily to ensure that we don't have empty objects in the output, which would confuse users.
Change-Id: Id756cd90348cd77a1647c2781a6ce26e7d9a3485 Signed-off-by: Ed Tanous <etanous@nvidia.com>
|
| 55968b12 | 06-May-2025 |
Ed Tanous <ed@tanous.net> |
Nvidia add cmet-info
Add decoding of more specific Error codes.
Unit tests pass.
Change-Id: Ia0ca0dfdf550381da435b0fb9041b664784f7476 Signed-off-by: Ed Tanous <etanous@nvidia.com> |
| 8870c074 | 28-Feb-2025 |
Erwin Tsaur <etsaur@nvidia.com> |
PCIe CPER Section Enhancement
This commit improves PCIe error reporting capabilities by:
- Adding support for PCIe capability version detection and parsing
- Expanding Advanced Error Reporting information extraction
The changes include:
- New capability_registers structure to decode PCIe capability registers
- Updated PCIe JSON Schema to match
- Support for PCIe 2.0+ extended registers when detected
- Improved error source identification and root error status reporting
- Fix typo for Advanced Error Reporting capabilit[i]es_control
- Updated generate/gen-section-pcie.c and pcie.json example
In the future we could:
- Implement TLP header log parsing with detailed descriptions
- Add support for Flit mode in PCIe 2.0+ devices
Tested:
- test/cper-tests passes
- cper-convert to-json|to-cper on pcie.cper|json in example path
- Tested "cper-convert to-json-section" using an extracted OS GHES PCIE CPER from error injection and compared against expected values
Note: schema validation for PCIe Advanced Error Reporting is intentionally less restrictive than it could be, since that part is still evolving.
Change-Id: Ifebb9d97d28a3a487a0aab53bf9e757afeedd64a Signed-off-by: Erwin Tsaur <etsaur@nvidia.com> Signed-off-by: Ed Tanous <etanous@nvidia.com>
|
| bd1814de | 31-Mar-2025 |
Khang D Nguyen <khangng@os.amperecomputing.com> |
Ensure FRU text is printable ASCII
Currently, libcper fails to compile on my machine (GCC 13):
    ../cper-utils.c: In function ‘add_untrusted_string’:
    ../cper-utils.c:467:23: error: comparison is always false due to limited range of data type [-Werror=type-limits]
      467 | if (c < 0) {
          | ^
The reason seems to be that char signedness is implementation-defined, so we have to explicitly use unsigned char or signed char to get a portable char type. In our case, char is unsigned char, hence the warning.
Apparently we are trying to validate ASCII strings from the records. Those strings seem to be used for display purpose only, so I think replacing it with a more precise printable ASCII test, which also does not care about char signedness, is appropriate here.
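A minimal sketch of a printable-ASCII test that does not depend on whether char is signed: each byte is converted to unsigned char and compared against the range 0x20 to 0x7E. The function name and the choice to stop at the first NUL are assumptions for this example, not the code added by this commit.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true if every byte before the first NUL (or before len bytes,
 * whichever comes first) is printable ASCII, i.e. in 0x20..0x7E.
 * Casting to unsigned char makes the comparison independent of the
 * platform's char signedness.
 */
bool is_printable_ascii(const char *text, size_t len)
{
	if (text == NULL)
		return false;

	for (size_t i = 0; i < len; i++) {
		unsigned char c = (unsigned char)text[i];

		if (c == '\0')
			break;
		if (c < 0x20 || c > 0x7E)
			return false;
	}
	return true;
}
```

Unlike a c < 0 check, the range comparison rejects bytes of 0x80 and above on both signed-char and unsigned-char platforms, and it also rejects control characters such as tab.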
This changes the JSON fruText property to appear only with printable ASCII FRU content. As a result, all of the examples have been changed where applicable. Some sections use FRU content with a predefined format (pcie, cxlprotocol) so fruText has been completely removed from those JSON objects like in the case of non-printable ASCII FRU content.
Tested: compiles successfully
Change-Id: I98c7c10a674c8817e0b2cbe82c26f6590d8d716a Signed-off-by: Khang D Nguyen <khangng@os.amperecomputing.com>
|
| a2dce4bc | 05-Mar-2025 |
Ed Tanous <etanous@nvidia.com> |
Convert files to hex
It was pointed out in code review that diffs of these files would be easier to review if they were stored in hex format on disk. This commit converts all the existing files to "cperhex", which is CPER in hexadecimal format, using the command 'xxd -p -l 64'
Change-Id: I5e762ec27a02b3d918b926a966074da8178d73b8 Signed-off-by: Ed Tanous <etanous@nvidia.com>
|
| 67018a26 | 18-Feb-2025 |
Aushim Nagarkatti <anagarkatti@nvidia.com> |
Add randomly generated CPER examples for Unit Tests
These examples of CPER blobs and their outputs will be used to validate CPER binaries against their JSON output.
Unit tests to be overhauled to use valijson in a subsequent patch.
Change-Id: I51cc00df22b043fcd71a8cc3ae79bfebb53e66d9 Signed-off-by: Aushim Nagarkatti <anagarkatti@nvidia.com>
|